Genomic data resource of type strains of genus Pseudoxanthomonas

The genus Pseudoxanthomonas is a relatively recently characterized group of gammaproteobacteria of environmental origin. Species of the genus are morphologically very similar to strains of Xanthomonas, Xylella and Stenotrophomonas. However, the genomic resources of this genus have remained largely unexplored. Its species have been isolated from a wide range of environmental sites, including hydrocarbon-polluted fields. Here, we provide the whole genome sequences of all available type strains of the genus Pseudoxanthomonas. To deduce the differences from closely related genera, we employed a whole genome-based investigation of the type species of the genus Pseudoxanthomonas.


Specifications
Subject: Biological sciences
Specific subject area: Microbiology: Bacteriology
Type of data: Whole genome sequences (assembled genomes with gene annotation) and phylogeny of the genus Pseudoxanthomonas and its related genera.
How the data were acquired: Whole genome sequencing (WGS) libraries were prepared for the Illumina MiSeq sequencing platform. Assembly of the raw reads was performed using SPAdes v3.10.
Data format: Raw; Analyzed
Parameters for data collection: Sequencing libraries for all the available type strains were prepared for Illumina MiSeq following the manufacturer's instructions. Sequencing was performed with a 2 × 250 bp paired-end sequencing kit.
Description of data collection: WGS data obtained from the sequencer were quality trimmed by the control software of the Illumina MiSeq. Raw reads were de novo assembled into high-quality draft genomes using SPAdes v3.10 and quality checked using CheckM v1.

Value of the Data
• Species of the genus Pseudoxanthomonas have been isolated from sites contaminated with heavy metals, oil, hydrocarbons, etc. Genome resources of strains from such extreme environmental conditions will aid in the identification of genomic signatures underlying their bioremediation potential.
• These assembled genomes can be reused as references by taxonomists and microbiologists to distinguish any putative species of the genus Pseudoxanthomonas.
• The present genome resource of type strains will be valuable in addressing the taxonomic ambiguities of the family Lysobacteraceae and the order Lysobacterales.

Data Description
Here, we have performed whole-genome sequencing of 15 type strains of the genus Pseudoxanthomonas, comprising 14 validly named species and one non-valid type strain, P. jiangsuensis DSM 22398T, according to the latest LPSN classification (v2.0). Whole genome data of P. dokdonensis DSM 21858T, P. indica P15T and P. spadix BD-a59 were obtained from the public repository of NCBI (Table 1). The type strains P. helianthi NBRC 110414T [2] and P. putridarboris LMG 25968T [3] could not be retrieved, and thus their whole genome sequence information is not included in the study. The 16S rRNA-based phylogeny of all twenty species of the genus Pseudoxanthomonas is depicted in Fig. 1.
Whole genome sequences of the type strains of the genus Pseudoxanthomonas can be a valuable resource for taxonogenomic studies of the family Lysobacteraceae and its close relatives such as Xanthomonas and Stenotrophomonas [4,5]. Isolates from extreme environments, such as P. taiwanensis [6], could be among the key biotechnologically important species for exploring heat stress mechanisms. Genome resources of P. broegbernensis, P. indica, P. kalamensis, P. kaohsiungensis, P. sacheonensis, P. spadix and P. jiangsuensis [7–13] could be used for studying the genomic determinants of stress tolerance.

Bacterial strains and culture conditions
Type strains of the genus Pseudoxanthomonas were procured from two culture collections, the Korean Collection for Type Cultures (KCTC) and the Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures (Table 1). Ampoules containing the respective bacterial cultures were revived in the media and under the conditions recommended by the respective culture collection.

Genome sequencing, assembly and annotation
Bacterial genomic DNA was extracted using the ZR Fungal/Bacterial DNA MiniPrep Kit (Zymo Research, Irvine, CA, USA) and quantified using a Qubit 2.0 Fluorometer (Thermo Fisher Scientific, Waltham, MA, USA). One nanogram of DNA was used to prepare Illumina sequencing libraries with the Nextera XT sample preparation kit with dual indexing, following the provider's instructions. Sequencing libraries were pooled and sequenced in-house on the Illumina MiSeq platform with a 2 × 250 bp paired-end sequencing kit.
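As a rough illustration of what a 2 × 250 bp paired-end run yields, the expected sequencing depth can be estimated from the number of read pairs and the genome size. The sketch below is only an arithmetic aid: the function name, the read-pair count and the 4 Mb genome size are hypothetical values chosen for illustration and are not figures from this study.

```python
def expected_depth(read_pairs: int, read_length: int, genome_size: int) -> float:
    """Approximate mean depth: total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    total_bases = read_pairs * 2 * read_length  # two reads per pair
    return total_bases / genome_size


# Hypothetical example: 800,000 read pairs of 2 x 250 bp on a 4 Mb genome.
depth = expected_depth(read_pairs=800_000, read_length=250, genome_size=4_000_000)
print(f"Expected depth: {depth:.0f}x")
```

Such an estimate ignores reads lost to quality trimming and uneven coverage, so a depth measured by mapping reads back to the assembly will usually be somewhat lower.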
The raw sequencing reads were assembled into high-quality draft genomes using SPAdes v3.10 [14], a de Bruijn graph-based assembler for bacterial genomes. The quality of the assembled genomes was assessed using QUAST v4.4 [15], and the overall coverage of the assembled genomes was calculated using BBMap [16]. The presence of putative plasmids in the assembled genomes was assessed using plasmidSPAdes [17] with a minimum length cut-off of 1 kb. The assembled genomes were annotated using the NCBI Prokaryotic Genome Annotation Pipeline [18]. Assembly information, together with the putative number of plasmids, is summarized in Table 1.
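To make two of the quantities above concrete, the following minimal sketch shows how a minimum length cut-off (such as the 1 kb threshold applied to putative plasmid contigs) and the N50 contiguity statistic, one of the metrics that assembly evaluation tools such as QUAST report, can be computed from a list of contig lengths. It is not the code used by QUAST or plasmidSPAdes; the function names and the toy contig lengths are assumptions made for illustration.

```python
def filter_contigs(lengths, min_length=1000):
    """Keep only contigs at least min_length bp long (1 kb by default)."""
    return [n for n in lengths if n >= min_length]


def n50(lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    if not lengths:
        raise ValueError("no contigs given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths in bp (hypothetical, not taken from Table 1).
contigs = [500, 1200, 3000, 8000, 15000]
kept = filter_contigs(contigs)   # drops the 500 bp contig
print(kept, n50(kept))
```

A higher N50 indicates a more contiguous assembly, but it says nothing about correctness, which is why completeness and contamination checks are reported separately.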

Phylogenetic assessment
Phylogenetic analysis based on the traditional 16S rRNA gene sequence was performed. The 16S rRNA gene sequences were fetched from the respective assembled genomes using a standalone academic version of RNAmmer v1.2 [19], except for the type strains of P. spadix, P. helianthi and P. putridarboris. For these three species, the 16S rRNA gene sequences were taken from the respective species entries in LPSN. Multiple sequence alignment of the 16S rRNA gene sequences was performed using ClustalW [20]. A phylogenetic tree based on the Maximum Likelihood method with 1000 bootstrap replications was generated using MEGA v7.0.18 [21].
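The idea behind bootstrap replication can be sketched in a few lines: each replicate resamples the columns of the multiple sequence alignment with replacement, a tree is inferred from every resampled alignment, and the support for a branch is the fraction of replicate trees that contain it. The example below illustrates only the resampling step and is not MEGA's implementation; the function name, the toy alignment and the random seed are assumptions made for illustration, and tree inference is deliberately omitted.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: alignment columns resampled with replacement."""
    n_cols = len(next(iter(alignment.values())))
    if any(len(seq) != n_cols for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    # The same sampled columns are applied to every sequence, so each
    # resampled site keeps its original pattern across taxa.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


# Toy aligned fragments (hypothetical sequences, not real 16S rRNA data).
toy = {"strain_A": "ACGTACGT", "strain_B": "ACGTTCGT", "strain_C": "ACGAACGA"}
rng = random.Random(42)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), replicates[0])
```

Each of the 1000 replicates has the same dimensions as the original alignment, which is what allows the same tree-building method to be applied to all of them.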

Ethics Statement
There is no ethical concern involved in the study.

Author Contributions
SK, SS and PPP carried out strain procurement from the culture collections and strain revival. KB and SK performed whole genome sequencing and submission of the assembled genomes to NCBI. PBP conceived the study and participated in its design. All the authors have read and approved the manuscript.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships which have or could be perceived to have influenced the work reported in this article.

Data Availability
Genomic data resource of type strains of genus Pseudoxanthomonas (Original data) (NCBI genome).