High quality draft genome sequence of the moderately halophilic bacterium Pontibacillus yanchengensis Y32T and comparison among Pontibacillus genomes

Pontibacillus yanchengensis Y32T is an aerobic, motile, Gram-positive, endospore-forming, and moderately halophilic bacterium isolated from a salt field. In this study, we describe the features of P. yanchengensis strain Y32T and compare its genome with those of four other Pontibacillus strains. The 4,281,464 bp high-quality draft genome of strain Y32T is arranged into 153 contigs containing 3,965 protein-coding genes and 77 RNA-encoding genes. The genome of strain Y32T possesses many genes related to its halophilic character, flagellar assembly and chemotaxis, which support its survival in a salt-rich environment.

Members of the genus Pontibacillus are characterized as moderately halophilic, Gram-positive, aerobic, endospore-forming and rod-shaped bacteria. They are motile by peritrichous flagella and their DNA has a low G+C content. They are able to survive in salt-rich environments and grow optimally at 5-20 % NaCl (w/v) [9].
To adapt to saline environments, halophilic microorganisms have developed various biochemical strategies to maintain cell function, such as the induction of Na+/H+ antiporter systems and the production of compatible solutes. Compatible solutes are attracting increasing interest because they can be used as stabilizers, salt antagonists, or stress-protective agents [10][11][12][13]. In addition, a Pontibacillus strain has been reported to produce biosurfactants, which are useful in the degradation of paraffinic mixtures or saline organic contamination [11].
In this study, we sequenced five Pontibacillus type strains: P. yanchengensis Y32T, P. chungwhensis BH030062T, P. marinus BH030004T, P. halophilus JSM076056T and P. litoralis JSM072002T (the GenBank accession summary of the strains is shown in Additional file 2). Here we present the draft genome sequence of P. yanchengensis Y32T and compare it to the genomes of the four other type strains. To the best of our knowledge, this is the first description of Pontibacillus genomes.

Organism information
Classification and features
P. yanchengensis Y32T was isolated from a salt field in Yancheng prefecture, on the east Yellow Sea in China. A taxonomic analysis was conducted based on the 16S rRNA gene sequence. Representative 16S rRNA gene sequences of the most closely related strains were downloaded from NCBI and aligned with CLUSTAL W [14]. Phylogenetic consensus trees were constructed from the aligned gene sequences using the neighbor-joining method with 1,000 bootstrap replicates in MEGA 6.0 [15]. The phylogenetic tree based on the 16S rRNA gene sequences indicated that strain Y32T clustered within a branch containing the other species of the genus Pontibacillus (Fig. 1a).
Seventeen related strains of Bacillaceae [2] with complete genome sequences were chosen for further phylogenetic analysis, including the four draft-genome sequences of Pontibacillus that were sequenced by us. In total, 602 core protein sequences were extracted using the cluster algorithm tool OrthoMCL [16,17]. The neighbor-joining (NJ) phylogenetic tree showed that the five Pontibacillus species clustered into the same branch (Fig. 1b), which was in accordance with the 16S rRNA gene-based phylogeny (Fig. 1a).
Fig. 1b The NJ phylogenetic tree of P. yanchengensis Y32T relative to 16 genome-sequenced strains from the Bacillaceae family was built based on the core protein sequences. All genome FASTA files were downloaded from NCBI except for those of the genus Pontibacillus. A total of 602 conserved proteins were identified using the cluster algorithm tool OrthoMCL [16,17]. The phylogenetic trees were constructed using the neighbor-joining method in MEGA 6.0 [15] with 1,000 bootstrap replicates and default parameters
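Neighbor-joining operates on a matrix of pairwise distances computed from the alignment. As a minimal sketch of that input step, the following computes uncorrected p-distances (the fraction of differing sites, ignoring gapped columns) for a small alignment. The sequences and names are hypothetical toy data, and the p-distance is used here only for illustration; the distance model actually applied by MEGA 6.0 in this study is not stated in the text.

```python
from itertools import combinations


def p_distance(a: str, b: str, gap: str = "-") -> float:
    """Fraction of differing sites between two aligned sequences.

    Columns in which either sequence has a gap are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    diffs = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x != y:
            diffs += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return diffs / compared


def distance_matrix(alignment: dict) -> dict:
    """Pairwise p-distances keyed by (name1, name2), names in sorted order."""
    return {
        (n1, n2): p_distance(alignment[n1], alignment[n2])
        for n1, n2 in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment (not data from this study).
toy_alignment = {
    "seqA": "ACGTACGT",
    "seqB": "ACGTACGA",
    "seqC": "AC-TTCGA",
}
matrix = distance_matrix(toy_alignment)
for pair, dist in matrix.items():
    print(pair, round(dist, 3))
```

The resulting matrix is the kind of input a neighbor-joining implementation would consume; bootstrap support would then be obtained by resampling alignment columns and repeating the procedure.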
P. yanchengensis Y32T is Gram-positive, rod-shaped (0.5-0.9 × 1.9-2.5 μm), motile with flagella (Fig. 2) and endospore-forming. It can grow on Bacto marine broth 2216 (Difco) agar medium containing 3-20 % (w/v) NaCl and does not grow in the absence of NaCl [1]. The optimal growth temperature for Y32T is 35-40 °C (Table 1). The strain is oxidase- and catalase-positive and negative for the production of H2S or indole. It has been reported to reduce nitrate to nitrite [1]. P. yanchengensis Y32T can use several compounds as sole carbon sources, including D-glucose, D-fructose, D-mannitol, D-maltose and D-trehalose [1]. Among the Pontibacillus type strains, only P. yanchengensis Y32T can utilize D-mannitol as a sole carbon source [1]. KEGG pathway analysis of the five Pontibacillus genomes (see below) revealed that only strain Y32T has the key enzyme mannitol-1-phosphate 5-dehydrogenase (gene ID: N782_14920), which could potentially catalyze the conversion of D-mannitol 1-phosphate to D-fructose 6-phosphate. This result is consistent with the phenotype. Mannitol is one of the most abundant polyols in nature, and its metabolism makes an important physiological contribution to microbial stress responses [18].

Genome sequencing information
Genome project history
P. yanchengensis Y32T was selected for sequencing on the basis of its taxonomic representativeness, halophilic features and potential industrial applications. Genome sequencing was performed by Majorbio Biopharm Technology Co., Ltd., Shanghai, China. The draft genome sequence, comprising contigs larger than 200 bp, was deposited at NCBI. The GenBank accession number is AVBF00000000. A summary of the genome sequencing project information is shown in Table 2.
Growth conditions and DNA isolation
P. yanchengensis Y32T was grown aerobically in 50 mL Bacto marine broth 2216 (Difco) plus 5 % NaCl (w/v) at 37 °C for 2 d with shaking at 150 rpm. Cells were harvested by centrifugation, and a pellet with an approximate wet weight of 20 mg was obtained. The genomic DNA was extracted using the QIAamp DNA kit according to the manufacturer's instructions (Qiagen, Germany). The quality and quantity of total DNA were determined using a NanoDrop 2000 spectrophotometer. Five micrograms of genomic DNA was sent to Majorbio (Shanghai, China) for sequencing on a HiSeq 2000 (Illumina, CA) sequencer. A paired-end (PE) library with an average insert size of 300 bp was sequenced to determine the genome sequence of P. yanchengensis Y32T. A total of 4,083,912 × 2 high-quality reads totaling 824,950,224 bp of data, with an average coverage of 186.5×, was generated. Raw reads were filtered using the FastQC toolkit, assembled with SOAPdenovo v1.05, and optimized through local gap filling and base correction with GapCloser.
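The relationship between read count, total bases and sequencing depth can be checked with simple arithmetic. The sketch below uses only figures quoted in the text (read pairs, total bases, and the 4,283,464 bp genome length given under Genome properties). It computes a naive depth as total bases divided by genome length; this is an assumption about the formula, and the value it gives is somewhat higher than the reported 186.5×, for which the text does not state the exact calculation.

```python
# Figures quoted in the text for the Y32T sequencing run.
read_pairs = 4_083_912        # "4,083,912 x 2" high-quality reads
total_bases = 824_950_224     # total high-quality bases (bp)
genome_size = 4_283_464       # assembled genome length (bp), Genome properties

# Each pair contributes two reads.
n_reads = read_pairs * 2

# Mean read length implied by the totals.
mean_read_length = total_bases / n_reads

# Naive depth estimate: sequenced bases per genome base.
# Assumption: the authors' 186.5x may use a different numerator or
# denominator (for example, mapped bases only); this is not stated.
naive_coverage = total_bases / genome_size

print(f"reads: {n_reads:,}")
print(f"mean read length: {mean_read_length:.1f} bp")
print(f"naive coverage: {naive_coverage:.1f}x")
```

The implied mean read length of 101 bp is consistent with a standard paired-end HiSeq 2000 run.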

Genome annotation
The draft genome sequence was deposited at NCBI and annotated through the Prokaryotic Genome Annotation Pipeline, which combines the Best-Placed reference protein set and the gene caller GeneMarkS+. The WebMGA server was used to identify the Clusters of Orthologous Groups (COGs) [19]. Transmembrane helices and signal peptides were predicted by the online bioinformatic tools TMHMM 2.0 [20,21] and SignalP 4.1 [22], respectively.
Table footnote: Evidence codes: TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [31].
Table footnote: The total is based on either the size of the genome in base pairs or the total number of protein-coding genes in the annotated genome.

Genome properties
The final assembled genome of P. yanchengensis Y32T was 4,283,464 bp long, distributed in 153 contigs, and had an average GC content of 39.11 %. Of the 4,080 predicted genes, 3,965 were protein-coding genes (CDSs) and 77 were RNA genes. A total of 2,615 CDSs (65.95 %) were assigned putative functions, and the remaining proteins were annotated as hypothetical proteins. The genome properties and statistics are summarized in Table 3. The distribution of genes into COG functional categories is shown in Table 4.
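As a minimal sketch of how such summary statistics are derived, the following computes a length-weighted GC content over a set of contigs and reproduces the percentage of CDSs with assigned functions from the counts given above. The contig sequences are hypothetical toy data, and the helper names are illustrative rather than part of any pipeline used in this study.

```python
def gc_content(seq: str) -> float:
    """Percent G+C among unambiguous bases (A, C, G, T) of one sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


def assembly_gc(contigs: list) -> float:
    """GC content of a multi-contig assembly.

    Concatenating the contigs weights each contig by its length.
    """
    return gc_content("".join(contigs))


# Hypothetical toy contigs (not sequence data from strain Y32T).
toy_contigs = ["ATGC", "GGCC", "AATT"]
toy_gc = assembly_gc(toy_contigs)

# Counts quoted in the text.
total_genes = 4080
cds = 3965
rna_genes = 77
cds_with_function = 2615

pct_with_function = 100.0 * cds_with_function / cds
# Genes that are neither CDSs nor RNA genes; the text does not
# say what these are, so no label is assigned here.
other_genes = total_genes - cds - rna_genes

print(f"toy assembly GC: {toy_gc:.2f} %")
print(f"CDSs with putative function: {pct_with_function:.2f} %")
print(f"genes not counted as CDS or RNA: {other_genes}")
```

The second calculation recovers the 65.95 % reported in the text, which confirms that the percentage is expressed relative to the number of CDSs rather than to the total gene count.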

Insights from the genome sequence
In this study, we compared the genome sequence of P. yanchengensis Y32T with the genomes of P. chungwhensis BH030062T, P. halophilus JSM076056T, P. marinus BH030004T and P. litoralis JSM072002T. The general features of the five genomic sequences are summarized in Table 5. The core genome analysis suggested that the five Pontibacillus species share 2,160 core genes and that P. yanchengensis Y32T possesses 1,651 unique genes (Fig. 3a). Among the 1,651 unique genes of strain Y32T, 1,154 were classified into 20 COG functional categories, mainly general function prediction, carbohydrate transport and metabolism, and function unknown. The remaining 590 unique genes were not classified into any COG category (Additional file 1: Table S1). The CGView Comparison Tool [23] was used to draw a circular comparison map of the five Pontibacillus strains (Fig. 3b).
All the Pontibacillus species were isolated from salty environments. They are characterized as moderately halophilic and cannot grow in the absence of NaCl. For moderate halophiles, the effective establishment of ionic and osmotic equilibrium is important for survival in a saline environment. The genome comparison showed that the five Pontibacillus strains possess genes encoding cation/proton antiporters (e.g., Na+/H+ antiporter, Na+/Ca2+ antiporter), which play a role in tolerance to high concentrations of Na+, K+, Li+ and/or alkali (Additional file 1: Table S2). Numerous studies have shown that Na+/H+ antiporters play important roles in the pH and Na+ homeostasis of cells [24,25]. Meanwhile, the prediction of transmembrane helices in the P. yanchengensis Y32T genome suggested that nearly 30 % of the genes encode proteins with transmembrane helices (Table 3), which may be involved in ion transport.
Table footnote: The percentage is based on the total number of protein-coding genes in the annotated genome.
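The core and strain-specific gene counts discussed above follow from the presence or absence of each ortholog cluster across the five strains. A minimal sketch of that bookkeeping is given below. It assumes a simplified representation in which each cluster is mapped to the set of strains that contribute at least one gene; the cluster identifiers and memberships are hypothetical toy data, and this is not the OrthoMCL output format.

```python
def core_and_unique(clusters: dict, strains: list):
    """Split ortholog clusters into core and strain-specific sets.

    clusters maps a cluster id to the set of strains that have at
    least one gene in that cluster.
    Returns (core_ids, unique_ids_by_strain).
    """
    all_strains = set(strains)
    # Core: clusters present in every strain.
    core = {cid for cid, members in clusters.items() if all_strains <= members}
    # Unique: clusters present in exactly one strain.
    unique = {s: set() for s in strains}
    for cid, members in clusters.items():
        if len(members) == 1:
            (only,) = members
            if only in unique:
                unique[only].add(cid)
    return core, unique


strains = ["Y32T", "BH030062T", "BH030004T", "JSM076056T", "JSM072002T"]

# Hypothetical toy clusters (not results from this study).
toy_clusters = {
    "og1": set(strains),
    "og2": set(strains),
    "og3": {"Y32T"},
    "og4": {"BH030004T"},
    "og5": {"Y32T", "BH030062T", "BH030004T"},
}

core, unique = core_and_unique(toy_clusters, strains)
print("core clusters:", sorted(core))
for strain in strains:
    print(strain, "unique clusters:", sorted(unique[strain]))
```

Clusters shared by some but not all strains (such as `og5` in the toy data) fall into neither set; in a flower plot like Fig. 3a only the core count and the per-strain unique counts are displayed.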
In addition to ion transport, the synthesis of compatible solutes (e.g., betaine, ectoine, amino acids) is beneficial for survival under extreme osmotic stress. Many compatible solute synthesis-related genes were identified in the genomes of the five Pontibacillus species (Additional file 1: Table S2). The Kyoto Encyclopedia of Genes and Genomes (KEGG) was used to reconstruct the glycine, serine and threonine metabolic pathways (Fig. 4). The metabolic pathways suggested that the five Pontibacillus strains could synthesize glycine as the main compatible solute. In addition, P. yanchengensis Y32T, P. chungwhensis BH030062T and P. marinus BH030004T could synthesize betaine through the precursor choline. P. marinus BH030004T also possessed the ectoine synthesis pathway. These results indicated that the five Pontibacillus species use different strategies to cope with osmotic stress.
Fig. 3 Comparative genomic analysis of the genus Pontibacillus. a The flower plot shows the numbers of species-specific genes found in each genome of each species (in the petals) and the core orthologous gene number (in the center) of Pontibacillus. b Comparison map of strain P. yanchengensis Y32T and the other four sequenced Pontibacillus strains. From outside to inside: rings 1, 4 show protein-coding genes colored by COG categories on the forward/reverse strand, respectively; rings 2, 3 represent genes on the forward/reverse strand, respectively; rings 5, 6, 7, 8 denote the CDS vs CDS BLAST results of P. marinus BH030004T, P. chungwhensis BH030062T, P. halophilus JSM076056T, and P. litoralis JSM072002T, respectively; ring 9 shows the GC skew
Many flagella-related genes were identified in the genomes of the five Pontibacillus species. Reconstruction of a multi-organism KEGG map suggested that the five Pontibacillus strains had intact chemotaxis systems (Fig. 5a) and flagella assembly-related genes (flg, fli and flh) (Fig. 5b). The moderately halophilic Pontibacillus strains were unable to grow with NaCl as the sole salt unless artificial seawater was added [1,4-8]. Flagella and chemotaxis may play important roles in the response to environmental salts.
Fig. 4 The glycine, serine and threonine metabolic pathways of the five Pontibacillus strains (P. yanchengensis Y32T, P. marinus BH030004T, P. chungwhensis BH030062T, P. halophilus JSM076056T, and P. litoralis JSM072002T) reconstructed by KEGG. The green box represents the enzyme shared by all five strains to synthesize glycine. The blue boxes denote the enzymes involved in betaine synthesis, which were found in P. yanchengensis Y32T, P. chungwhensis BH030062T and P. marinus BH030004T. The pathway with pink boxes is found only in P. marinus BH030004T and is related to ectoine synthesis