Genomic information of the arsenic-resistant bacterium Lysobacter arseniciresistens type strain ZS79T and comparison of Lysobacter draft genomes

Lysobacter arseniciresistens ZS79T is a highly arsenic-resistant, rod-shaped, motile, non-spore-forming, aerobic, Gram-negative bacterium. In this study, four Lysobacter type strains were sequenced, and the genomic information of L. arseniciresistens ZS79T and the comparative genomics results of the Lysobacter strains are described. The draft genome sequence of strain ZS79T consists of 3,086,721 bp and is distributed in 109 contigs. It has a G+C content of 69.5 % and contains 2,363 protein-coding genes, including eight arsenic resistance genes.

So far, the genomic sequences of two Lysobacter strains have been published (Lysobacter capsici AZ78 [8,9] and Lysobacter antibioticus 13-6 [10]), but the annotation of L. antibioticus 13-6 was not completed. To provide genome information for the genus Lysobacter, we performed whole genome sequencing of four Lysobacter strains (L. arseniciresistens ZS79T, Lysobacter concretionis Ko07T [5], Lysobacter daejeonensis GH1-9T [11], and Lysobacter defluvii IMMIB APB-9T [12]). In this study, the genome features of L. arseniciresistens ZS79T are provided and the comparative results of five Lysobacter genomes are presented.
Phylogenetic analyses were performed based on 16S rRNA genes (Fig. 1a) and 831 conserved proteins (Fig. 1b). In both trees, strain ZS79T clusters with the other four strains of the genus Lysobacter. The topologies of the two trees are similar, but the genome-based tree is more stable than the 16S rRNA gene tree (Fig. 1b vs 1a).
L. arseniciresistens ZS79T is an aerobic, motile, Gram-negative bacterium with a minimum inhibitory concentration of 14 mM arsenite in R2A medium (Table 1). The cells are rod-shaped with one flagellum and are non-spore-forming (Fig. 2). Colonies of this strain are yellow, nontransparent, convex, circular, and smooth [1].

Genome project history
The genome of L. arseniciresistens ZS79T was sequenced in April 2013 and finished within two months. The high-quality draft genome sequence is available in the GenBank database under accession number AVPT00000000.
The genome sequencing project information is summarized in Table 2.

Growth conditions and genomic DNA preparation
L. arseniciresistens ZS79T was cultured in 50 ml of LB (Luria-Bertani) medium at 28 °C for 3 days with shaking at 160 r/min. About 10 mg of cells were harvested by centrifugation, suspended in normal saline, and then lysed using lysozyme. The DNA was extracted and purified using the QIAamp kit according to the manufacturer's instructions (Qiagen, Germany).

Genome sequencing and assembly
The whole genome sequencing of L. arseniciresistens ZS79T was performed on an Illumina HiSeq 2000 with a paired-end library strategy (300 bp insert size) at Majorbio Biomedical Science and Technology Co. Ltd. DNA libraries with insert sizes from 300 to 500 bp were constructed using the established protocol [13]. The obtained high-quality data contain 4,528,542 × 2 paired reads and 194,996 single reads with an average read length of 91 bp. The sequencing depth was 272.6×. The reads were assembled using SOAPdenovo v1.05 [14].
Fig. 1b The NJ tree based on 831 conserved proteins among the ten Xanthomonadaceae strains. Phylogenetic analyses were performed using MEGA version 6 [33]. The trees were built using the p-distance model and a bootstrap analysis of 1000 replicates. The GenBank numbers are listed after each strain.
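The reported sequencing depth can be sanity-checked from the read statistics given above. The following is a minimal sketch, assuming depth is simply total sequenced bases divided by assembly length; the function and variable names are illustrative and not part of the original pipeline.

```python
# Read statistics as reported in the text.
paired_reads = 4_528_542 * 2      # both mates of every read pair
single_reads = 194_996
mean_read_length_bp = 91          # reported average (rounded)
assembly_length_bp = 3_086_721    # draft genome size


def sequencing_depth(n_reads: int, read_len_bp: float, genome_len_bp: int) -> float:
    """Average depth = total sequenced bases / genome length (assumed definition)."""
    return n_reads * read_len_bp / genome_len_bp


depth = sequencing_depth(paired_reads + single_reads,
                         mean_read_length_bp,
                         assembly_length_bp)
print(f"approximate depth: {depth:.1f}x")
```

Under these assumptions the estimate comes out at roughly 273×, close to the reported 272.6×; the small difference plausibly reflects the rounded average read length.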

Genome annotation
The draft sequence of L. arseniciresistens ZS79T was annotated using the National Center for Biotechnology Information (NCBI) Prokaryotic Genomes Annotation Pipeline [15]. The functions of the predicted genes were determined through BLAST alignment against the NCBI protein database. Genes were identified using the gene caller GeneMarkS+ with the similarity-based gene detection approach [16]. Additional features were predicted by WebMGA [17], TMHMM [18] and SignalP [19].

Genome properties
The whole genome sequence of L. arseniciresistens ZS79T is 3,086,721 bp long with a G+C content of 69.6 % and is distributed into 109 contigs. It has 2,422 predicted genes, including 2,363 (97.6 %) protein-coding genes, 50 (2.1 %) RNA genes, and 9 (0.4 %) pseudogenes [20]. More detailed genome statistics are shown in Table 3. The protein functional classification according to COGs is shown in Table 4. The genome map is shown in Fig. 3.
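The gene-category percentages quoted above follow directly from the reported counts. As a minimal sketch (the dictionary keys and variable names are illustrative), the shares can be recomputed as:

```python
# Gene counts as reported in the text.
gene_counts = {"protein-coding": 2363, "RNA": 50, "pseudo": 9}

total_genes = sum(gene_counts.values())

# Percentage of all predicted genes in each category, rounded to one decimal.
gene_shares = {
    category: round(100 * count / total_genes, 1)
    for category, count in gene_counts.items()
}

for category, share in gene_shares.items():
    print(f"{category}: {gene_counts[category]} genes ({share} %)")
```

The three categories sum to the 2,422 predicted genes stated in the text, and the rounded shares match the reported 97.6 %, 2.1 % and 0.4 %.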

Insights from the genome sequences
To obtain features of Lysobacter genomes, we sequenced four genomes of the genus Lysobacter and performed comparative genomic analysis among the five available genomes of this genus. The general features of these five genomes are summarized in Table 5. To calculate the pan-genome and core-genome of these five genomes, we performed ortholog clustering analysis using OrthoMCL [21]. The pan-genome has 6,409 ortholog families and the core-genome has 1,207 orthologs. The numbers of unique genes of each genome are shown in Fig. 4. To evaluate the genome variation of these five genomes, we first performed multiple alignments among the genome sequences using MAUVE [22] and then calculated the nucleotide diversity using DnaSP v5 [23]. The five genomes share 0.73 Mb of co-linear sequences. The π value of these sequences among the five genomes is 0.173, which corresponds to an approximate nucleotide sequence identity of 83 % (1 - 0.173 ≈ 0.83) among the genomes of Lysobacter [23].
Fig. 3 Graphical circular map of the L. arseniciresistens ZS79T genome. From outer to inner, ring 1 shows the genomic islands (red bars) predicted by IslandViewer [34]; rings 3 and 4 show the predicted genes on the forward/reverse strands; rings 2 and 5 show the genes assigned to COGs; rings 6 to 9 show the ORF similarity between the genome of L. arseniciresistens ZS79T and the genomes of L. concretionis Ko07T, L. daejeonensis GH1-9T, L. capsici AZ78 and L. defluvii IMMIB APB-9T; ring 10 shows the G+C content plot.
In the genome of L. arseniciresistens ZS79T, we found that the distribution of genomic islands is consistent with the regions of anomalous G+C content (Fig. 3). In addition, few gene sequences from the other four Lysobacter genomes could be aligned with these genomic island regions (Fig. 3, rings 6 to 9). These results indicate that the genes within the genomic islands were most probably acquired by horizontal transfer [24] and that these regions are unique to the genome of L. arseniciresistens ZS79T.
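The pan-genome, core-genome and unique-gene counts, as well as the nucleotide diversity π described in this section, can be illustrated on toy data. The sketch below is not the OrthoMCL or DnaSP implementation: it assumes each ortholog family (including singletons) is represented as the set of genomes it occurs in, and it computes π as the average proportion of differing sites over all sequence pairs in a gap-free alignment. All names and the toy inputs are hypothetical.

```python
from itertools import combinations


def pan_core_unique(families):
    """Count pan-genome families, core families and genome-specific families.

    `families` is a list of sets; each set holds the names of the genomes
    that contribute at least one gene to that ortholog family.
    """
    genomes = set().union(*families)
    pan = len(families)                                   # every family counts
    core = sum(1 for fam in families if fam == genomes)   # present in all genomes
    unique = {g: sum(1 for fam in families if fam == {g}) for g in sorted(genomes)}
    return pan, core, unique


def nucleotide_diversity(aligned):
    """Average pairwise proportion of differing sites (pi), gap-free alignment."""
    length = len(aligned[0])
    pairs = list(combinations(aligned, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


# Toy ortholog families over three hypothetical genomes A, B and C.
toy_families = [{"A", "B", "C"}, {"A", "B"}, {"A"}, {"C"}, {"A", "B", "C"}]
pan, core, unique = pan_core_unique(toy_families)
print(pan, core, unique)

# Toy alignment of three hypothetical sequences.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
pi = nucleotide_diversity(toy_alignment)
print(f"pi = {pi:.3f}, approximate identity = {1 - pi:.1%}")

# The value reported in the text converts the same way.
reported_identity = 1 - 0.173
print(f"reported pi 0.173 -> approximate identity {reported_identity:.0%}")
```

Real analyses handle paralogs, alignment gaps and ambiguous bases, which this sketch deliberately ignores.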
According to the Kyoto Encyclopedia of Genes and Genomes (KEGG) [25] annotation results, all five Lysobacter genomes have a nearly complete type II secretion system, which could secrete cell wall degrading enzymes [26]. This result may correspond to the ability of Lysobacter members to lyse cells of many microorganisms [3]. In addition, the genomes of L. arseniciresistens ZS79T, L. concretionis Ko07T and L. defluvii IMMIB APB-9T contain genes for flagellar assembly, whereas the genome of L. daejeonensis GH1-9T does not contain any genes for flagellar assembly and that of L. capsici AZ78 does not contain genes for the flagellar filament (Additional file 1: Table S2). These genotypes correspond to the phenotype descriptions that L. daejeonensis and L. capsici are non-motile [8,11].
Genomic analysis revealed eight genes related to arsenic resistance in the genome of L. arseniciresistens ZS79T (Additional file 1: Table S3). This result is consistent with the arsenite resistance of this strain [1]. By contrast, fewer arsenic resistance genes were found in the genomes of L. concretionis Ko07T, L. defluvii IMMIB APB-9T, L. capsici AZ78, and L. daejeonensis GH1-9T.

Conclusions
The genomic information of L. arseniciresistens ZS79T and the comparative genomic analysis of the five Lysobacter strains are presented. The genome-based phylogeny is in agreement with the 16S rRNA gene-based one, indicating the usefulness of genomic information for bacterial taxonomic classification. Analysis of the genomes shows a certain correlation between the genotypes and the phenotypes.
The genomes of L. arseniciresistens ZS79T, L. concretionis Ko07T, L. daejeonensis GH1-9T and L. defluvii IMMIB APB-9T were sequenced in this study. The genome of L. capsici AZ78 was sequenced by Puopolo et al. [9].
Fig. 4 The core-genome and the unique genes of the five Lysobacter genomes. The Venn diagram shows the number of orthologous gene families of the core-genome (in the center) and the numbers of unique genes of each genome.