Draft genome sequence of Sphingomonas paucimobilis strain LCT-SP1 isolated from the Shenzhou X spacecraft of China

Sphingomonas paucimobilis strain LCT-SP1 is a glucose-nonfermenting, Gram-negative, chemoheterotrophic, strictly aerobic bacterium. The major features of strain LCT-SP1, isolated from the Chinese spacecraft Shenzhou X, are described in this paper together with the draft genome and its annotation. The genome of strain LCT-SP1 is 4,302,226 bp in size, with 3,864 protein-coding and 50 RNA genes. The information gained from its sequence is potentially relevant to the elucidation of microbially mediated corrosion of various materials.


Introduction
Sphingomonas paucimobilis strain LCT-SP1 is a glucose-nonfermenting, Gram-negative, chemoheterotrophic, strictly aerobic bacterium [1]. Based on 16S rRNA gene sequences, LCT-SP1 is most closely related to Sphingomonas haloaromaticamans, which has been isolated from water and soil. Several studies suggest that S. paucimobilis can degrade many compounds or materials, such as ferulic acid [2], lignin [3], and biphenyl [4]. LCT-SP1 was isolated from the condensate water in the Chinese spacecraft Shenzhou X.
LCT-SP1 can corrode numerous materials, including epoxy resin, ester polyurethane, and ether polyurethane. Therefore, the strain may be a suitable model for examining the properties of genes involved in microbial corrosion of materials used in aerospace applications. This study describes the draft genome of S. paucimobilis strain LCT-SP1, together with its sequencing and annotation, which may be helpful in investigating the possible mechanisms of microbial corrosion of materials.

Classification and features
A phylogenetic tree of LCT-SP1 and representative members of the genus Sphingomonas was constructed with MEGA 5 [5] using the maximum likelihood method based on 16S rRNA gene sequences (Fig. 1). Figure 1 shows that LCT-SP1 is most closely related to Sphingomonas sp. DSM 30198 (HF558376), G1Bc9 (KF465966), SKJH-30 (AY749436), and G3Cc10 (KF465968), with a sequence similarity of 100 % based on BLAST analysis. In addition, because the average nucleotide identity (ANI) is an important index in phylogenetic analysis [6], the ANI between LCT-SP1 and Sphingomonas paucimobilis NBRC 13935 was also calculated. The ANI was 99.68 %, which is greater than 95 % (the species ANI cutoff value). Therefore, LCT-SP1 is assumed to belong to the species Sphingomonas paucimobilis.
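The ANI-based species assignment reduces to a threshold comparison. The following is a minimal sketch in Python, not the tool used in this study; the function name `same_species_by_ani` is hypothetical, and the 95 % cutoff and the 99.68 % value are taken from the text.

```python
SPECIES_ANI_CUTOFF = 95.0  # percent; species-level ANI cutoff quoted in the text


def same_species_by_ani(ani_percent: float,
                        cutoff: float = SPECIES_ANI_CUTOFF) -> bool:
    """Return True when an ANI value (in percent) exceeds the species cutoff."""
    if not 0.0 <= ani_percent <= 100.0:
        raise ValueError("ANI must be a percentage between 0 and 100")
    return ani_percent > cutoff


# ANI between LCT-SP1 and S. paucimobilis NBRC 13935, as reported in the text
ani_lct_sp1_vs_nbrc_13935 = 99.68
print(same_species_by_ani(ani_lct_sp1_vs_nbrc_13935))
```

The comparison is strict ("greater than 95 %"), matching the wording of the text; a value exactly at the cutoff is treated here as not conclusive.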

Genome sequencing information
Genome project history
A summary of the main project information for S. paucimobilis strain LCT-SP1 is shown in Table 2. This organism was isolated from the condensate water in the Shenzhou X spacecraft and was selected for sequencing on the basis of its phylogenetic affiliation with a lineage of S. paucimobilis. Sequence data for this organism were deposited in GenBank under accession number KR080483, which corresponds to the 16S rRNA gene sequence of LCT-SP1.

Growth conditions and genomic DNA preparation
S. paucimobilis strain LCT-SP1 was grown aerobically overnight on an LB agar plate at 35°C. The total genomic DNA was extracted from 20 mL of cells using a CTAB bacterial genomic DNA isolation method [7] with kits provided by Illumina Inc. according to the manufacturer's instructions. DNA quality and quantity were determined by spectrophotometry.

Genome sequencing and assembly
The genome of LCT-SP1 was sequenced using paired-end sequencing technology [8] with an Illumina HiSeq2000 (Illumina, San Diego, CA, USA) at Majorbio Bio-pharm Technology Co., Ltd. (Shanghai, China). Draft assemblies were based on 6,986,766 reads totaling 1,754 Mbp from the 300 bp PCR-free library and 3,442,511 reads totaling 1,556 Mbp from the 6,000 bp index library.
The assembly was performed using the SOAPdenovo software package version 1.05 [9]. Gaps between scaffolds were closed by custom primer walks or by PCR amplification, followed by DNA sequencing to achieve optimal assembly results. The genome contained 3,884 candidate protein-encoding genes (with an average size of 958 bp), giving a coding density of 87.7%. A total of 1,906 proteins were assigned to 25 COG families [10]. A total of 47 tRNA genes and 3 rRNA genes were identified.
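The read totals reported for the two libraries imply a raw sequencing depth that can be estimated by dividing the sequenced bases by the genome size. The sketch below is illustrative only: it assumes every sequenced base contributes to coverage (no filtering or trimming), and the function name `fold_coverage` is hypothetical. The yields and genome size are the values given in the text.

```python
GENOME_SIZE_BP = 4_302_226  # assembled genome size reported in the text

# Total sequenced bases per library in Mbp, as reported in the text
LIBRARY_YIELD_MBP = {
    "300 bp PCR-free library": 1_754,
    "6,000 bp index library": 1_556,
}


def fold_coverage(yield_mbp: float, genome_size_bp: int) -> float:
    """Approximate raw depth of coverage: sequenced bases / genome size."""
    return yield_mbp * 1_000_000 / genome_size_bp


for name, yield_mbp in LIBRARY_YIELD_MBP.items():
    print(f"{name}: ~{fold_coverage(yield_mbp, GENOME_SIZE_BP):.0f}x")

# Combined raw depth across both libraries
total = fold_coverage(sum(LIBRARY_YIELD_MBP.values()), GENOME_SIZE_BP)
print(f"combined: ~{total:.0f}x")
```

Under these assumptions each library alone corresponds to several hundred-fold raw coverage; the effective depth after quality filtering would be lower.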

Genome annotation
Protein-coding genes of the draft genome assemblies were predicted using Glimmer version 3.0 [11]. The predicted CDSs were translated and used to search the KEGG, COG, String, NR, and GO databases. The results from these data sources were combined to assign a product description to each predicted protein. tRNAs and rRNAs were predicted using tRNAscan-SE [12] and RNAmmer [13], respectively. Automatic gene annotation was performed by the National Center for Biotechnology Information Prokaryotic Genomes Automatic Annotation Pipeline [14].

Genome properties
The LCT-SP1 genome consisted of a 4,302,226 bp circular chromosome with a GC content of 65.66 % (Table 3). Of the 3,934 predicted genes, 3,884 (98.73 %) were protein-coding genes, and 50 (1.27 %) were RNA genes (3 rRNA genes and 47 tRNA genes). In addition, among the total predicted genes, 1,906 (48.45 %) were assigned to COG functional categories. Of these, the most abundant COG category was "General function prediction only" (211 proteins) followed by "Amino acid transport and metabolism" (171 proteins), "Translation" (141 proteins), "Energy production and conversion" (140 proteins), "Replication, recombination and repair" (130 proteins), "Function unknown" (124 proteins), "Inorganic ion transport and metabolism" (210 proteins), and "Replication, recombination and repair" (201 proteins). The properties and statistics of the genome are summarized in Table 3. The draft genome map of S. paucimobilis strain LCT-SP1 is illustrated in Fig. 3, and the distribution of genes into COG functional categories is presented in Table 4.
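The percentages quoted for the gene counts follow directly from the totals and can be recomputed as a consistency check. This is a small illustrative sketch; the variable names and the `percent` helper are hypothetical, and all counts are those reported in the text.

```python
TOTAL_GENES = 3_934                    # all predicted genes
PROTEIN_CODING = 3_884                 # protein-coding genes
RNA_GENES = {"rRNA": 3, "tRNA": 47}    # RNA genes by type
COG_ASSIGNED = 1_906                   # genes assigned to COG categories


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, in percent, rounded to two decimals."""
    return round(100 * part / whole, 2)


rna_total = sum(RNA_GENES.values())

# Protein-coding and RNA genes should add up to the total gene count
assert PROTEIN_CODING + rna_total == TOTAL_GENES

print("protein-coding:", percent(PROTEIN_CODING, TOTAL_GENES), "%")
print("RNA genes:     ", percent(rna_total, TOTAL_GENES), "%")
print("COG-assigned:  ", percent(COG_ASSIGNED, TOTAL_GENES), "%")
```

All three fractions use the 3,934 predicted genes as the denominator, which is how the text expresses them.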

Insights from the genome sequence
Several studies suggest that the species S. paucimobilis can degrade many compounds or materials, such as ferulic acid [2], lignin [3], and biphenyl [4]. Arens et al. reported that the localized corrosion of copper cold-water pipes resulted from members of the genus Sphingomonas, leading to surface erosions, covered tubercles, and through-wall pinhole pits on the inner surface of the pipe [15]. S. paucimobilis strain LCT-SP1 can corrode several materials, including epoxy resin, ester polyurethane, and ether polyurethane (unpublished data). LCT-SP1 was isolated from the condensate water in the Chinese spacecraft Shenzhou X. Therefore, LCT-SP1 could be a suitable model for studying the properties of genes involved in microbial corrosion of aerospace-related materials.
Additionally, genes for EC 1.14.11.2, gloA, and arsC were present in LCT-SP1, each identified with 100% similarity to its counterpart in Sphingomonas sp. S17 [16]. EC 1.14.11.2 is categorized as a procollagen-proline catalyzing enzyme [17]. The gloA gene encodes a glyoxalase that can reduce methylglyoxal toxicity in a cell [18]. Furthermore, the arsC gene encodes an arsenate reductase that converts arsenate into arsenite, which is then exported from cells by an energy-dependent efflux process [19]. Therefore, the genes mentioned above are likely responsible for the ability of LCT-SP1 to degrade various recalcitrant aromatic compounds and polysaccharides.
The LCT-SP1 genome also contained an NhaA-type CDS for the Na+/H+ antiporter and some subunits of the multisubunit cation antiporter (Na+/H+) [20], which suggests that this strain can tolerate alkaline and hypersaline environments and could corrode metallic materials by changing the pH balance at their surface.
In addition, bacterial biofilms may be beneficial for corrosion control because they remove corrosive agents and generate a protective layer [21]. The LCT-SP1 genome included genes encoding the biofilm dispersion protein BdlA and a biofilm growth-associated repressor, which could inhibit biofilm formation and may therefore help explain the microbial corrosion of materials. Further studies are needed to investigate these corrosion-related coding sequences and to reveal the role of LCT-SP1 in the microbial corrosion of materials.

Conclusions
The genome of S. paucimobilis strain LCT-SP1, isolated from the condensate water in the Chinese spacecraft Shenzhou X, was sequenced. The LCT-SP1 genome included numerous genes that are likely responsible for its ability to degrade various recalcitrant aromatic compounds and polysaccharides. Further study of these corrosion-related coding sequences may reveal the role of S. paucimobilis LCT-SP1 in microbial corrosion.