Genome sequence of the chromate-resistant bacterium Leucobacter salsicius type strain M1-8T

Leucobacter salsicius M1-8T is a member of the family Microbacteriaceae within the order Actinomycetales. This strain is a Gram-positive, rod-shaped bacterium that was previously isolated from a Korean fermented food. Most members of the genus Leucobacter are chromate-resistant, and this feature could be exploited in biotechnological applications. However, the genus Leucobacter is poorly characterized at the genome level, despite its potential importance. Thus, the present study determined the features of Leucobacter salsicius M1-8T, as well as its genome sequence and annotation. The genome comprised 3,185,418 bp with a G+C content of 64.5%, and included 2,865 protein-coding genes and 68 RNA genes. This strain possessed two predicted genes associated with chromate resistance, which might facilitate its growth in heavy metal-rich environments.

L. salsicius strain M1-8T has lower chromate resistance than L. chromiiresistens, but it still exhibits moderate resistance (up to 10.0 mM Cr(VI)). Thus, the genomic analysis of L. salsicius M1-8T should help us to understand the molecular basis of adaptation to a chromium-contaminated environment. The present study determined the classification and features of Leucobacter salsicius strain M1-8T, as well as its genome sequence and gene annotations.
The multiple sequence alignment program CLUSTALW [18] was used to align the 16S rRNA gene sequences from M1-8T and related taxa. Phylogenetic trees were constructed from the aligned gene sequences using the maximum-likelihood, maximum-parsimony, and neighbor-joining methods, each with 1,000 randomly selected bootstrap replicates, in MEGA version 5 [19].
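The bootstrap step mentioned above can be illustrated with a minimal sketch. It is not the MEGA implementation. It only shows the general idea of resampling alignment columns with replacement to build one replicate data set. The toy sequences, the taxon names, and the function name `bootstrap_alignment` are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate by resampling columns with replacement."""
    length = len(next(iter(alignment.values())))
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    # Draw as many column indices as the alignment has columns.
    columns = [rng.randrange(length) for _ in range(length)]
    # Apply the same column sample to every taxon so columns stay intact.
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


# Toy aligned sequences (hypothetical, not real 16S rRNA data).
toy_alignment = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTTCGTAC",
    "taxon_C": "ACGAACGTCC",
}

rng = random.Random(42)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(1000)]
```

In a full analysis, a tree would be inferred from each replicate, and the support value for a branch would be the percentage of replicate trees that contain it.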
Strain M1-8T shared 99.1% nucleotide sequence similarity with L. aerolatus Sj10T, the closest validated Leucobacter species according to the phylogeny (Figure 1). Figure 1 shows the phylogenetic position of L. salsicius in the 16S rRNA-based tree. The sequence of the single 16S rRNA gene copy found in the genome did not differ from the previously published 16S rRNA sequence (GQ352403).

Figure 1. Phylogenetic tree showing the position of L. salsicius M1-8T relative to related taxa, using Glaciibacter superstes AHU1791T as the outgroup. The sequences were aligned using CLUSTALW [18] and the phylogenetic tree was inferred from 1,390 aligned characters of the 16S rRNA gene sequence using the maximum-likelihood (ML) algorithm [20] with MEGA5 [19]. The branches are scaled in terms of the expected number of substitutions per site. The numbers adjacent to the branches are the support values based on 1,000 ML bootstrap replicates [20] (left), 1,000 maximum-parsimony bootstrap replicates [21] (middle), and 1,000 neighbor-joining bootstrap replicates [22] (right), for values >50%.
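A pairwise similarity value such as the 99.1% reported above is, in essence, the share of identical positions between two aligned sequences. The sketch below shows one way to compute it, assuming that positions with a gap in either sequence are excluded. Tools differ in how they treat gaps, so this convention is an assumption and not necessarily the one used in this study. The function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percentage of identical positions between two aligned sequences.

    Positions where either sequence has a gap are skipped (an assumption;
    other conventions count gaps as mismatches).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy aligned pair (hypothetical, not real 16S rRNA data):
# one gapped column is skipped, and one of the nine remaining columns differs.
identity = percent_identity("ACGT-ACGTA", "ACGTTACGAA")
```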

Morphology and physiology
Strain M1-8T is classified as class Actinobacteria, order Actinomycetales, family Microbacteriaceae, genus Leucobacter (Table 1) [1]. L. salsicius strain M1-8T was isolated from a Korean salt-fermented food that contains tiny shrimp (shrimp jeotgal). The cells of strain M1-8T were rod-shaped, 1.0-1.5 μm in length, and 0.4-0.5 μm in diameter (Figure 2). No flagella were observed. The colonies were cream in color and circular with entire margins on marine agar medium. Strain M1-8T was aerobic and Gram-positive (Table 1). Optimum growth was observed at 25-30°C, at pH 7.0-8.0, and in the presence of 0-4% (w/v) NaCl. Tolerance to Cr(VI) was observed at up to 10.0 mM K2CrO4. The physiological characteristics, such as the growth substrates of M1-8T, were described in detail in a previous study [1].

The evidence codes are as follows. TAS: traceable author statement (i.e., a direct report exists in the literature). NAS: non-traceable author statement (i.e., not observed directly in a living, isolated sample, but based on a generally accepted property of the species, or anecdotal evidence). These evidence codes are derived from the Gene Ontology project [32].

Standards in Genomic Sciences

Genome sequencing and annotation
Genome project history
L. salsicius strain M1-8T was selected for genome sequencing based on its environmental potential and is part of the Next-Generation BioGreen 21 Program (No. PJ008208). The genome sequence was deposited in DDBJ/EMBL/GenBank under accession number AOCN00000000 and the genome project was deposited in the Genomes On Line Database [33] under Gi21829. The sequencing and annotation were performed by ChunLab Inc., South Korea. A summary of the project information and the associations with "Minimum Information about a Genome Sequence" (MIGS) [34] are shown in Table 2.

Growth conditions and DNA isolation
L. salsicius strain M1-8T was cultured aerobically in marine agar medium at 30°C. Genomic DNA was extracted using a G-spin DNA extraction kit (iNtRON Biotechnology), according to the standard protocol recommended by the manufacturer.

Genome sequencing and assembly
The genome was sequenced using a combination of an Illumina HiSeq system with a 150 base pair (bp) paired-end library, a 454 Genome Sequencer FLX Titanium system (Roche) with an 8 kb paired-end library, and a PacBio RS system (Pacific Biosciences). The Illumina reads were assembled using CLC Genomics Workbench ver. 5.0. The initial assembly was converted for the CLC Genomics Workbench by constructing fake reads from the consensus to collect the read pairs in the Illumina paired-end library. The 454 paired-end reads were assembled with Illumina data using gsAssembler ver. 2.6 (Roche) and the PacBio sequences were clustered into overlapping assembled data. CodonCode Aligner and CLC Genomics Workbench 5.0 were used for sequence assembly and quality assessment in the subsequent finishing process.

Genome annotation
The genes in the assembled genome were predicted using the Integrated Microbial Genomes-Expert Review (IMG-ER) platform as part of the DOE-JGI genome annotation pipeline [35], followed by a round of manual curation using the JGI GenePRIMP pipeline. The predicted ORFs were compared against the SEED [36], NCBI COG [37], EzTaxon-e [38], and Pfam [39] databases during gene annotation. Additional gene prediction analyses and functional annotation were performed with the Rapid Annotation using Subsystem Technology (RAST) server databases [40] and the gene-caller GLIMMER 3.02. RNAmmer 1.2 [41] and tRNAscan-SE 1.23 [42] were used to identify rRNA genes and tRNA genes, respectively. CLgenomics™ 1.06 (ChunLab) was used to visualize the genomic features.

Genome properties
The genome comprised a circular chromosome with a length of 3,185,418 bp and a G+C content of 64.5% (Figure 3 and Table 3). Of the 2,933 predicted genes, 2,865 were protein-coding genes and 68 were RNA genes (three 5S rRNA genes, three 16S rRNA genes, three 23S rRNA genes, 51 predicted tRNA genes, and eight miscRNA genes). The majority of the protein-coding genes (2,275 genes; 77.6%) were assigned putative functions, while the remainder were annotated as hypothetical proteins (182 genes). The genome properties and statistics are summarized in Table 3. The distributions of genes among the COGs functional categories are shown in Table 4.

The total is based on the total number of protein-coding genes in the annotated genome.
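The gene counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text; the variable names are illustrative. It confirms that the RNA gene categories sum to 68 and that the total is 2,933. It also computes the share of functionally assigned genes against two possible denominators. The reported 77.6% is reproduced when all predicted genes are used, which suggests that the percentage is based on the total gene count.

```python
# Counts taken from the genome properties paragraph.
rna_genes = {
    "5S rRNA": 3,
    "16S rRNA": 3,
    "23S rRNA": 3,
    "tRNA": 51,
    "miscRNA": 8,
}
protein_coding = 2865
with_function = 2275

total_rna = sum(rna_genes.values())
total_genes = protein_coding + total_rna

# Share of genes with a putative function, against two candidate denominators.
share_of_total = 100.0 * with_function / total_genes
share_of_protein_coding = 100.0 * with_function / protein_coding

print(f"RNA genes: {total_rna}")
print(f"All predicted genes: {total_genes}")
print(f"Assigned function, of all genes: {share_of_total:.1f}%")
print(f"Assigned function, of protein-coding genes: {share_of_protein_coding:.1f}%")
```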

Insights from the genome sequence
Leucobacter salsicius M1-8T and other Leucobacter members, such as L. chromiireducens, L. aridicollis, L. luti, and L. alluvii, have been shown to possess chromate resistance in previous studies, while Zhu et al. reported the reduction of chromate by Leucobacter sp. [43]. In the present study, the genome analysis of Leucobacter salsicius M1-8T detected two copies of the gene for chromate transport protein A (ChrA), a membrane protein that confers heavy metal tolerance via chromate ion efflux from the cytoplasm. This gene is potentially a key feature that allows Leucobacter to adapt to chromate-contaminated environments. The genome sequence of L. salsicius M1-8T should provide deeper insights into the molecular mechanisms that underlie chromium tolerance, and it may facilitate the development of biotechnological applications to remediate chromium-contaminated field sites.