Draft genome sequence data of a 4-nitrophenol-degrading bacterium, Pseudomonas alloputida strain PNP

A 4-nitrophenol-degrading bacterial strain, PNP, was isolated from pesticide-contaminated soil collected from Lucknow, India. Strain PNP utilized 0.5 mM 4-nitrophenol as its carbon source and degraded it completely within 24 h with stoichiometric release of nitrite ions. In a phylogenetic tree, strain PNP grouped within the genus Pseudomonas and exhibited the highest 16S rRNA gene sequence similarity to Pseudomonas juntendi BML3 (99.79%) and Pseudomonas inefficax JV551A3 (99.79%). Based on the average nucleotide identity and digital DNA-DNA hybridization values between strain PNP and its closely related type strains, it was concluded that strain PNP belongs to Pseudomonas alloputida. The PNP genome was sequenced on the Illumina HiSeq platform, and the draft genome sequence of Pseudomonas alloputida PNP is presented here. The draft assembly has a total size of 6,087,340 bp, distributed across 87 contigs with an N50 value of 139,502 bp. The genome has an average GC content of 61.7% and contains 5461 coding sequences and 77 putative RNA genes. This Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the accession JAGKJH000000000.

Subject: Microbiology
Specific subject area: Environmental Microbiology
Type of data: Data are presented in FASTA format, figures, and tables
How data were acquired: The Illumina HiSeq system was used to generate genome sequence data
Data format: Raw, analysed and assembled genome sequences
Parameters for data collection: A pure culture of Pseudomonas alloputida PNP was obtained and cultivated, and its DNA was isolated and sequenced.
Description of data collection: Genome sequencing, assembly, and annotation. Genome sequencing was performed using the HiSeq platform, and Unicycler v0.4.8 was used for initial assembly. Annotation was performed using the NCBI Prokaryotic Genome Automatic Annotation Pipeline and the RAST server.
Data

Value of the Data
• The Pseudomonas alloputida PNP genome sequence could reveal important details on the degradation of 4-nitrophenol and other xenobiotics.
• The data could be useful for researchers working on biodegradation and bioremediation of various aromatic compounds.
• This genome information could be useful for comparative genomic research on Pseudomonas strains with biodegradation capability.

Data Description
Pseudomonas alloputida PNP was isolated from pesticide-contaminated soil collected from Lucknow, India. Strain PNP utilized 4-nitrophenol as its carbon source, degrading it completely within 24 h and releasing stoichiometric amounts of nitrite ions. Table 1 summarizes the genomic characteristics of Pseudomonas alloputida PNP. The assembled genome of Pseudomonas alloputida PNP contained 87 contigs with a total length of 6,087,340 bp and an N50 value of 139,502 bp. The G + C content of the genome was 61.7%. The NCBI Prokaryotic Genome Automatic Annotation Pipeline (PGAAP) predicted a total of 5635 genes, of which 5461 were protein-coding sequences, 77 were RNA genes (69 tRNAs, 5 ncRNAs, 3 16S-23S-5S rRNAs), and 97 were pseudogenes. Fig. 1 shows a circular map of the Pseudomonas alloputida PNP genome.
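The assembly statistics reported above (number of contigs, total length, N50, and G + C content) can be recomputed from any contig FASTA file. The following is a minimal sketch using only the Python standard library; the toy sequences and the function names are illustrative assumptions and are not part of the deposited data or of the tools used in this work.

```python
from typing import Dict, Iterable, List


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into a {contig_name: sequence} mapping."""
    records: Dict[str, List[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {key: "".join(parts) for key, parts in records.items()}


def n50(lengths: Iterable[int]) -> int:
    """Contig length at which the cumulative length (longest contigs first)
    reaches at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running * 2 >= total:
            return length
    return 0


def gc_percent(seqs: Iterable[str]) -> float:
    """G + C percentage over unambiguous bases (A, C, G, T) only."""
    seqs = list(seqs)
    gc = sum(s.count("G") + s.count("C") for s in seqs)
    acgt = sum(s.count(base) for s in seqs for base in "ACGT")
    return 100.0 * gc / acgt if acgt else 0.0


# Toy contigs (hypothetical, for illustration only).
TOY_FASTA = """>contig_1
GGCCGGCCAT
>contig_2
ATGCGC
>contig_3
ATAT
"""

contigs = parse_fasta(TOY_FASTA)
lengths = [len(seq) for seq in contigs.values()]
toy_n50 = n50(lengths)
toy_gc = gc_percent(contigs.values())
print(f"contigs={len(contigs)} total={sum(lengths)} bp N50={toy_n50} bp GC={toy_gc:.1f}%")
```

Applied to the real assembly, the same functions would be expected to reproduce the values in Table 1, provided that ambiguous bases are handled the same way as in the original calculation.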
Annotation of the Pseudomonas alloputida PNP genome using the RAST server predicted a total of 5726 coding sequences, which were categorized into 371 subsystems with 28% subsystem coverage (Fig. 2). The subsystem category "amino acids and derivatives" contained the highest number of genes (471), followed by carbohydrates (257), protein metabolism (215), cofactors, vitamins, prosthetic groups, and pigments (196), respiration (125), membrane transport (113), stress response (109), and fatty acids, lipids, and isoprenoids (104). The subsystem category "metabolism of aromatic compounds" contained 80 genes associated with the degradation of benzoate, 4-hydroxybenzoate, quinate, n-phenylalkanoic acid, gentisate, homogentisate, catechol, and protocatechuate. We also identified genes encoding hydroxyquinol 1,2-dioxygenase and maleylacetate reductase, which are involved in the lower pathway of 4-nitrophenol degradation in Gram-negative bacteria [1]. In addition, genes involved in the bioremediation of chromium and arsenic were detected, including those encoding a chromate efflux transporter, an AraC family transcriptional regulator, arsenate reductase ArsC, an ArsR/SmtB family metalloregulator transcription factor, arsenical resistance protein ArsH, organoarsenical efflux MFS transporter ArsJ, and arsenical efflux pump membrane protein ArsB. Phylogenetic analysis based on the 16S rRNA gene sequences of strain PNP and its closely related strains showed that strain PNP fell within the same clade as Pseudomonas alloputida Kh7 (Fig. 3). Additionally, whole-genome comparisons using average nucleotide identity and digital DNA-DNA hybridization indicated that strain PNP belongs to Pseudomonas alloputida. Table 2 shows the average nucleotide identity and digital DNA-DNA hybridization values between strain PNP and the closest reference type strains.
The average nucleotide identity and digital DNA-DNA hybridization values between strain PNP and Pseudomonas alloputida Kh7 were 97.34% and 77.90%, respectively. These values exceed the suggested thresholds for species delineation (95-96% for ANI and 70% for dDDH). Therefore, strain PNP is a new strain of Pseudomonas alloputida.
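The species assignment rests on a simple threshold comparison, which can be written out as a short check. This is a minimal sketch: the function name and the default cutoffs (the lower bound of the 95-96% ANI range, and 70% for dDDH) are assumptions chosen for illustration, and the stricter 96% ANI cutoff can be passed explicitly.

```python
ANI_CUTOFF = 95.0    # lower bound of the suggested 95-96% ANI range
DDDH_CUTOFF = 70.0   # suggested dDDH threshold for species delineation


def same_species(ani: float, dddh: float,
                 ani_cutoff: float = ANI_CUTOFF,
                 dddh_cutoff: float = DDDH_CUTOFF) -> bool:
    """Return True when both ANI and dDDH meet or exceed their cutoffs."""
    return ani >= ani_cutoff and dddh >= dddh_cutoff


# Values reported for strain PNP versus Pseudomonas alloputida Kh7.
pnp_vs_kh7 = same_species(ani=97.34, dddh=77.90)
pnp_vs_kh7_strict = same_species(ani=97.34, dddh=77.90, ani_cutoff=96.0)
print(pnp_vs_kh7, pnp_vs_kh7_strict)
```

Both the lenient and the strict ANI cutoffs give the same outcome for the reported values, so the assignment does not depend on which end of the 95-96% range is used.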

Sample collection and bacterial isolation
A 4-nitrophenol-mineralizing bacterial strain, PNP, was isolated by an enrichment method from pesticide-contaminated soil collected from Lucknow, India. For enrichment, 1 g of the collected soil sample was suspended in a 1000 ml Erlenmeyer flask containing 250 ml of minimal medium with 0.5 mM 4-nitrophenol as the sole source of carbon and energy. The medium was yellow due to the presence of 4-nitrophenol. The flask was incubated at 30 °C until the yellow colour of 4-nitrophenol disappeared. After decolourization, the culture was serially diluted and plated on minimal medium agar plates containing 0.5 mM 4-nitrophenol. The plates were then incubated at 30 °C for 72 h. One bacterial strain, designated PNP, was selected for its ability to degrade and decolourize 4-nitrophenol. This strain was preserved in 10% glycerol at -80 °C and used for this study.
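For readers reproducing the enrichment medium, the mass of 4-nitrophenol needed for a given volume follows directly from the concentration. The sketch below assumes that the compound is weighed directly as a solid (no stock solution) and uses the molar mass of 4-nitrophenol, 139.11 g/mol; the function name is illustrative.

```python
MW_4NP_G_PER_MOL = 139.11  # molar mass of 4-nitrophenol (C6H5NO3)


def mass_required_mg(conc_mM: float, volume_ml: float,
                     mw_g_per_mol: float = MW_4NP_G_PER_MOL) -> float:
    """Mass of solute (mg) needed to prepare volume_ml of a conc_mM solution."""
    amount_mmol = conc_mM * volume_ml / 1000.0   # mmol/L x L
    return amount_mmol * mw_g_per_mol            # mmol x mg/mmol


enrichment_mg = mass_required_mg(conc_mM=0.5, volume_ml=250)  # enrichment flask
growth_mg = mass_required_mg(conc_mM=0.5, volume_ml=100)      # growth experiment flask
print(f"250 ml: {enrichment_mg:.2f} mg, 100 ml: {growth_mg:.2f} mg")
```

This corresponds to roughly 17.4 mg of 4-nitrophenol for the 250 ml enrichment culture and about 7.0 mg for a 100 ml culture.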

Bacterial growth, 4-nitrophenol degradation and nitrite release
Strain PNP was grown in a 500 ml Erlenmeyer flask containing 100 ml of minimal medium with 0.5 mM 4-nitrophenol as its sole source of carbon and energy. The flask was incubated at 30 °C under shaking conditions. Samples were collected at regular intervals to monitor bacterial growth, 4-nitrophenol degradation, and nitrite release. Bacterial growth was monitored by measuring absorbance at 600 nm using a spectrophotometer. For 4-nitrophenol degradation, samples were centrifuged and the optical density of the supernatant was measured at 420 nm. Nitrite release was monitored by a colourimetric method as described previously [2].
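The measurements described above can be converted into percent degradation and nitrite recovery as sketched below. The sketch assumes that the absorbance of the supernatant at 420 nm is blank-corrected and proportional to the remaining 4-nitrophenol, and that one mole of nitrite is released per mole of 4-nitrophenol degraded. The time course and nitrite reading are invented for illustration and are not data from this study.

```python
def percent_degraded(a420_initial: float, a420_now: float) -> float:
    """Percent of 4-nitrophenol removed, assuming A420 scales linearly
    with the remaining substrate concentration."""
    if a420_initial <= 0:
        raise ValueError("initial absorbance must be positive")
    value = 100.0 * (a420_initial - a420_now) / a420_initial
    return max(0.0, min(100.0, value))  # clamp measurement noise to 0-100%


def nitrite_recovery_percent(nitrite_mM: float, substrate_mM: float) -> float:
    """Nitrite released as a percentage of the 1:1 stoichiometric maximum."""
    return 100.0 * nitrite_mM / substrate_mM


# Hypothetical readings: {time in h: A420 of the cell-free supernatant}.
time_course = {0: 1.20, 6: 0.90, 12: 0.48, 24: 0.00}
initial = time_course[0]
degradation = {t: percent_degraded(initial, a) for t, a in time_course.items()}
recovery = nitrite_recovery_percent(nitrite_mM=0.5, substrate_mM=0.5)

for t, pct in degradation.items():
    print(f"{t:>2} h: {pct:.1f}% degraded")
print(f"nitrite recovery: {recovery:.0f}%")
```

A nitrite recovery close to 100% at the point of complete decolourization is what is meant by stoichiometric release.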

Bacterial DNA isolation
Strain PNP was cultured at 30 °C on nutrient agar plates. A single colony of strain PNP was then cultivated overnight in nutrient broth under shaking conditions (180 rpm). The culture was centrifuged, and the cell pellet was used for DNA extraction. Genomic DNA was extracted using the DNA mini kit (Qiagen, Germantown, MD, USA) according to the manufacturer's instructions.

Whole genome sequencing, assembly and annotation
A whole-genome sequencing library was prepared using the Nextera XT DNA library preparation kit following the manufacturer's instructions. The libraries were sequenced on the HiSeq platform (Illumina, San Diego, CA, USA) with 150 bp paired-end reads. The initial quality of the raw sequencing data was checked using FastQC [3]. Trim Galore 0.6.5 was used to trim the raw reads and remove adaptor contamination [4], and Unicycler v0.4.8 was used for initial assembly [5]. Unless otherwise indicated, default parameters were used for all software. The NCBI Prokaryotic Genome Automatic Annotation Pipeline (PGAAP) [6] and the RAST server [7] were used for annotation. The CGView server was used to create and visualise a graphical circular map of the entire genome [8].
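The quality control, trimming, and assembly steps can be chained as sketched below. The script only builds the command lines and does not run them. The read file names and output directories are hypothetical, the commands use default parameters as stated above, and the trimmed file names assume gzipped input and Trim Galore's usual "_val_1" and "_val_2" naming for paired-end output; the exact commands used in this work are not given in the text.

```python
import shlex
from pathlib import PurePosixPath
from typing import List


def strip_fastq_suffix(path: str) -> str:
    """Return the file name without its gzipped FASTQ extension."""
    name = PurePosixPath(path).name
    for suffix in (".fastq.gz", ".fq.gz"):
        if name.endswith(suffix):
            return name[: -len(suffix)]
    raise ValueError(f"expected a gzipped FASTQ file, got: {path}")


def build_pipeline(r1: str, r2: str, outdir: str) -> List[List[str]]:
    """Build FastQC, Trim Galore, and Unicycler commands for one read pair."""
    out = PurePosixPath(outdir)
    qc_dir, trim_dir, asm_dir = out / "fastqc", out / "trimmed", out / "unicycler"
    # Assumed Trim Galore output names for paired-end, gzipped input.
    trimmed_1 = trim_dir / f"{strip_fastq_suffix(r1)}_val_1.fq.gz"
    trimmed_2 = trim_dir / f"{strip_fastq_suffix(r2)}_val_2.fq.gz"
    return [
        # Quality report on the raw reads (the output directory must already exist).
        ["fastqc", "-o", str(qc_dir), r1, r2],
        # Adapter and quality trimming of the read pair.
        ["trim_galore", "--paired", "-o", str(trim_dir), r1, r2],
        # Short-read assembly from the trimmed pair.
        ["unicycler", "-1", str(trimmed_1), "-2", str(trimmed_2), "-o", str(asm_dir)],
    ]


# Hypothetical input files and output directory.
commands = build_pipeline("PNP_R1.fastq.gz", "PNP_R2.fastq.gz", "results")
for cmd in commands:
    print(shlex.join(cmd))
```

Each printed line can be run in a shell, or each list can be passed to subprocess.run with check=True once the tools are installed.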

16S rRNA gene sequence and phylogenetic analysis
RNAmmer software was used to extract the 16S rRNA gene sequence of strain PNP from its genome [9]. The 16S rRNA gene sequence of strain PNP was compared against the EzBioCloud database to determine its most closely related type strains [10]. The 16S rRNA gene sequences of all closely related species were obtained from the EzBioCloud database. ClustalW was used to align all of the sequences [11]. The MEGA X software package was used to construct a phylogenetic tree using the neighbour-joining method [12].
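The percent similarity values reported for 16S rRNA gene sequences are derived from pairwise alignments. The sketch below shows one simple definition of percent identity for two already aligned sequences, in which columns containing a gap in either sequence are skipped. This is an illustrative assumption: EzBioCloud and MEGA X may treat gaps and ambiguous bases differently, and the toy sequences are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical): one mismatch and one gapped column.
seq_query = "ACGTACGT-A"
seq_reference = "ACGTTCGTGA"
identity = percent_identity(seq_query, seq_reference)
print(f"{identity:.2f}% identity")
```

Here 8 of the 9 gap-free columns match, so the toy pair has about 88.9% identity.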

Average nucleotide identity and digital DNA-DNA hybridization
The OrthoANI algorithm was used to calculate average nucleotide identity (ANI) values between the genomes of strain PNP and its closely related species [13], and digital DNA-DNA hybridization (dDDH) values were calculated using the genome-to-genome distance calculator (GGDC) 2.1 [14].