Genome sequence of Lysobacter dokdonensis DS-58T, a gliding bacterium isolated from soil in Dokdo, Korea

Lysobacter dokdonensis DS-58, belonging to the family Xanthomonadaceae, was isolated from a soil sample in Dokdo, Korea in 2011. Strain DS-58 is the type strain of L. dokdonensis. In this study, we determined its genome sequence and describe its genomic features, including annotation information and COG functional categorization. The draft genome sequence consists of 25 contigs totaling 3,274,406 bp (67.24 % G + C) and contains 3,155 protein coding genes, 2 copies of ribosomal RNA operons, and 48 transfer RNA genes. Among the protein coding genes, 75.91 % of the genes were annotated with a putative function and 87.39 % of the genes were assigned to a COG category. In the genome of L. dokdonensis, a large number of genes associated with protein degradation and antibiotic resistance were detected.


Introduction
The genus Lysobacter was first described by Christensen and Cook in 1979 as a group of high G + C Gram-negative bacteria with gliding motility [1]. In the past, Lysobacter species were classified as "unidentified myxobacters" due to their high G + C ratio and gliding motility. However, the genus Lysobacter has features distinct from those of myxobacteria and was proposed as a new genus of Gammaproteobacteria. Lysobacter species are ubiquitous and have been found in a variety of environments such as soil, water, and the rhizosphere. Currently, more than 30 Lysobacter species are registered in the GenBank taxonomy database, and 28 of them have been validly published [2]. Some Lysobacter species are known to produce several kinds of lytic enzymes and antibiotics [3] and to have antimicrobial activity against plant pathogens [4]. Moreover, several Lysobacter species are known to produce bioactive natural products such as cyclodepsipeptide, cyclic lipodepsipeptide, cephem-type β-lactam, and polycyclic tetramate macrolactam [5]. Despite their ubiquitous distribution, the many identified species, and their possible usefulness as biocontrol agents, relatively few Lysobacter genomes have been sequenced. Here, we present the genome sequence and the genomic information of Lysobacter dokdonensis DS-58T (KCTC 12822T = DSM 17958T), which is the type strain of the species.

Organism information
Classification and features
L. dokdonensis DS-58T is a Gram-staining-negative, nonmotile, rod-shaped bacterium that was isolated from a soil sample in Dokdo, an island in the East Sea, Korea, in 2011 [6]. L. dokdonensis DS-58 grows at temperatures of 4 to 38°C, at pH 6.0 to 8.0, and at NaCl concentrations of 0 to 0.5 % (w/v) [6]. Colony size of L. dokdonensis DS-58 is about 1.0-2.0 mm on nutrient agar medium, and the cells are 1.0-5.0 μm long and 0.4-0.8 μm wide [6] (Fig. 1). L. dokdonensis DS-58 can assimilate dextrin, Tween 40, maltose, α-ketobutyric acid, alaninamide, L-alanine, L-alanyl glycine, and L-glutamic acid as carbon sources [6]. Minimum information about a genome sequence (MIGS) for L. dokdonensis DS-58 is described in Table 1. Phylogenetically, L. dokdonensis DS-58 belongs to the family Xanthomonadaceae of the class Gammaproteobacteria, and its 16S rRNA gene showed the highest sequence similarity (96.93 %) with that of L. niastensis GH41-7. However, a phylogenetic tree based on the 16S rRNA gene showed that strain DS-58 occupies a deep branch within the genus Lysobacter (Fig. 2).

Genome project history
The genome sequencing and analysis of L. dokdonensis DS-58 were performed by the Laboratory of Microbial Genomics and Systems/Synthetic Biology at Yonsei University using next-generation sequencing. The genomic information was deposited in GenBank (accession number JRKJ00000000). A summary of the genome project is provided in Table 2.

Genome sequencing and assembly
For the whole genome shotgun sequencing, a library with a 500-bp insert size was prepared and paired-end genome sequencing was performed with HiSeq2000 of the Illumina/Solexa platform (Macrogen, Inc., South Korea). Sequence trimming was conducted using CLC Genomics Workbench 5.1 (CLC bio, Qiagen, Netherlands) with parameters of 0.01 quality score and no ambiguous nucleotides. Sequence reads below 60 bp in length were discarded. After trimming, a total of 28,810,330 reads with an average read length of 95.8 bp remained. De novo assembly was performed with CLC Genomics Workbench with parameters of automatic word and bubble size, deletion and insertion cost of 3, mismatch cost of 2, similarity fraction of 1.0, length fraction of 0.5, and minimum contig length of 500 bp. After the de novo assembly, scaffolding was performed using SSPACE [8] and automatic gap filling was carried out with IMAGE [9]. Following the automatic gap filling, manual gap filling was conducted using the Find Broken Pair Mates function of CLC Genomics Workbench at the ends of the contigs. Basic information of the genome sequencing project is described in Table 2.
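The read statistics above imply a very high sequencing depth. The following is a minimal sketch, not the authors' pipeline, of the 60-bp length filter and of a depth estimate computed as total trimmed bases divided by assembly size. The function names are hypothetical, and using the 3,274,406-bp draft assembly as a stand-in for the true genome size is an assumption.

```python
from typing import Iterable, List


def drop_short_reads(reads: Iterable[str], min_len: int = 60) -> List[str]:
    """Keep only reads that are at least min_len bases long."""
    return [read for read in reads if len(read) >= min_len]


def estimated_coverage(n_reads: int, mean_read_len: float, genome_size: int) -> float:
    """Approximate sequencing depth: total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * mean_read_len / genome_size


if __name__ == "__main__":
    # Values reported in the text; the draft assembly length is used as a
    # proxy for the genome size (assumption).
    depth = estimated_coverage(28_810_330, 95.8, 3_274_406)
    print(f"Estimated depth: {depth:.0f}x")
```

With the reported values this gives a depth on the order of 840-fold, which is more than sufficient for a de novo bacterial assembly.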

Genome annotation
Structural gene prediction was conducted using Glimmer 3 [10] in the RAST server [11] with automatic fixing of errors and frameshifts. Functional assignment of the predicted protein coding sequences (CDSs) was performed using AutoFact [12] with the results of BLASTP or RPS-BLAST searches against the Uniref100, NR, COG, and Pfam databases. To improve annotation accuracy, the functional assignments from the RAST server and from BLAST were compared with each other. When the assigned gene function differed between the RAST and BLAST results, an additional BLASTP search was performed against the NR database at NCBI and the top hit was selected for the annotation.
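The reconciliation rule described above can be sketched as a small decision function. This is an illustration only: the text does not say how two assignments were judged to be "the same", so the case- and whitespace-insensitive comparison here is an assumption, and `reconcile_annotation` and the `blastp_nr_top_hit` callback are hypothetical names standing in for the additional BLASTP search against NR.

```python
from typing import Callable


def _normalize(product: str) -> str:
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(product.lower().split())


def reconcile_annotation(
    rast_product: str,
    blast_product: str,
    blastp_nr_top_hit: Callable[[], str],
) -> str:
    """Return the product name to keep for one CDS.

    If the RAST and BLAST-based assignments agree, keep the RAST name.
    Otherwise call blastp_nr_top_hit (a placeholder for an extra BLASTP
    search against NR) and use the top hit it returns.
    """
    if _normalize(rast_product) == _normalize(blast_product):
        return rast_product
    return blastp_nr_top_hit()


if __name__ == "__main__":
    # Hypothetical example values, not taken from the DS-58 annotation.
    print(reconcile_annotation("Serine protease", "serine protease",
                               lambda: "unused"))
    print(reconcile_annotation("Hypothetical protein", "Beta-lactamase",
                               lambda: "Beta-lactamase class C"))
```

Passing the extra search as a callback means it only runs for CDSs whose two assignments disagree.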

Genome properties
The draft genome sequence of strain DS-58 consists of 25 contigs with a total length of 3,274,406 bp (G + C content 67.24 %) (Table 3 and Fig. 3). From the genome of strain DS-58, 3,155 CDSs, 2 copies of ribosomal RNA operons, and 48 transfer RNAs were detected. Among the predicted CDSs, 2,436 CDSs were annotated with a putative function and 2,757 CDSs were assigned to a COG category. The numbers and percentages of COG assigned genes are shown in Table 4.
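Summary statistics of this kind are simple to recompute from the contigs and the gene counts. The sketch below is a hedged illustration rather than the tool used in the study: `gc_percent` and `percent_of` are hypothetical helper names, and G + C content is taken over all bases in the contigs (ambiguous bases, if any, count toward the length), which is an assumption.

```python
from typing import Iterable


def gc_percent(contigs: Iterable[str]) -> float:
    """G + C content (%) over all contig sequences, case-insensitive."""
    total = 0
    gc = 0
    for seq in contigs:
        upper = seq.upper()
        total += len(upper)
        gc += upper.count("G") + upper.count("C")
    if total == 0:
        raise ValueError("no sequence data")
    return 100.0 * gc / total


def percent_of(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


if __name__ == "__main__":
    # Toy contigs for illustration only (not DS-58 sequence).
    print(f"{gc_percent(['GGCC', 'AT']):.2f} % G + C")
    # COG-assigned CDSs out of all predicted CDSs, as reported in the text.
    print(f"{percent_of(2757, 3155):.2f} % of CDSs assigned to a COG category")
```

Applied to the reported counts, 2,757 of 3,155 CDSs corresponds to the 87.39 % COG assignment rate given in the abstract.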

Insights from the genome sequence
Some Lysobacter species are known to produce secondary metabolites with antimicrobial activities [13,14].
In the genome of L. dokdonensis DS-58, biosynthetic gene clusters for a bacteriocin and an arylpolyene were detected. The structure of bacteriocin-biosynthetic gene cluster of DS-58 was similar to the one in L. arseniciresistens ZS79 and the structure of arylpolyene-biosynthetic gene cluster was similar to the one in Xanthomonas campestris NCPPB 4392 (Fig. 4).
In the genome of L. dokdonensis DS-58, a number of genes associated with proteolysis were detected, including 63 genes encoding peptidases and 33 genes encoding proteases. Microbial proteases are among the most important industrial enzymes due to their diverse activities, and the genus Bacillus is a major source of the proteases on the market [15,16]. Text mining of annotated gene products indicated that L. dokdonensis DS-58 has more genes encoding proteases and peptidases than other genome-sequenced Lysobacter species except for L. antibioticus ASM73109v1 and L. capsici AZ78. Moreover, in the genome of strain DS-58, genes encoding 17 β-lactamases, which degrade chemicals such as β-lactam antibiotics, were detected, as were genes encoding biotin-biosynthetic proteins and type IV fimbrial biogenesis proteins that could be involved in gliding motility.
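The text mining of annotated gene products mentioned above can be approximated by a keyword count over product names. This is a minimal sketch under stated assumptions: the exact matching rule used in the study is not given, so a case-insensitive substring match is assumed here, and `count_products` and the example product names are hypothetical.

```python
from typing import Dict, Iterable, Sequence


def count_products(products: Iterable[str], keywords: Sequence[str]) -> Dict[str, int]:
    """Count annotated products whose name contains each keyword.

    Matching is a case-insensitive substring test, so 'peptidase' also
    matches names such as 'metallopeptidase' (assumed behaviour).
    """
    counts = {keyword: 0 for keyword in keywords}
    for product in products:
        name = product.lower()
        for keyword in keywords:
            if keyword.lower() in name:
                counts[keyword] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical product names for illustration, not the DS-58 annotation.
    example_products = [
        "Serine protease",
        "Zinc metallopeptidase",
        "Prolyl oligopeptidase family protein",
        "ATP-dependent Clp protease",
        "hypothetical protein",
    ]
    print(count_products(example_products, ["protease", "peptidase"]))
```

Counts obtained this way depend on the annotation pipeline's naming conventions, so comparisons across genomes are most meaningful when all genomes were annotated with the same procedure.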
Distinct from other genera in the Xanthomonadaceae, Lysobacter spp. exhibit gliding motility [1]. Type IV pili-associated bacterial motility is widespread in members of diverse taxa such as Proteobacteria, Bacteroidetes, and Fibrobacteres [17] and is known to be responsible for S-motility in Myxococcus and twitching motility in Lysobacter [18] as well as Pseudomonas and Neisseria [19]. Thus, there is a possibility that the gliding motility of Lysobacter is associated with type IV fimbriae. On the other hand, GltA, which is involved in the A-motility of Myxococcus xanthus that best fits the definition of gliding motility [20], was detected in the genome of DS-58 (56 % identity with 88 % coverage).
Lysobacter species have typically been isolated from soil and water, but several studies have indicated that Lysobacter species may survive in more diverse habitats, including anaerobic or extremely cold environments [21,22]. A great diversity of secreted degradative enzymes such as proteases and β-lactamases may contribute to the adaptation of Lysobacter species to such diverse environments. The abundant genes encoding proteases and peptidases in the genome of DS-58 may contribute to the discovery of effective and commercially useful proteolytic enzymes. Moreover, in the genome of DS-58, dozens of genes involved in the biosynthesis of type IV fimbriae were detected. The mechanism of gliding motility has not yet been clearly revealed, and we expect that the genome information of DS-58 may contribute to the genetic analysis of bacterial gliding motility.