Transcriptomic dataset of wild type and phoP mutant Pectobacterium versatile

RNA-Seq transcriptome data for the wild type and phoP mutant strains of Pectobacterium versatile are described. P. versatile is a recently introduced name for a species of plant pathogenic bacteria that unites a group of strains previously embedded within the Pectobacterium carotovorum clade [1,2]. Little is known about how this pathogen adapts to changing environmental conditions, including those within its host plant. The PhoP/PhoQ two-component system is an important sensor that responds to several stimuli and is present in most species of enteric bacteria. It usually controls large regulons, which vary greatly even between closely related species [3]. This dataset enables the discovery of genes under direct or indirect transcriptional control by PhoP in P. versatile and should help in understanding the physiology of this plant pathogen.


Specifications
Value of the Data
• This dataset is, to our knowledge, the first RNA-seq dataset for P. versatile and will be valuable to the Pectobacterium research community for characterizing the highly divergent regulon controlled by the global transcription factor PhoP.
• The data may be useful for researchers studying the adaptation of P. versatile to changing environments, including plant colonisation.
• The data can be used to define the PhoP regulon and to establish the role of PhoP in the control of P. versatile virulence.
• This dataset can be used to study operon organisation in P. versatile.

Data description
The dataset contains sequencing data obtained through transcriptome sequencing of two P. versatile strains: JN42 and its phoP mutant derivative UK1, both grown in a synthetic medium supplemented with polygalacturonic acid. Samples for transcriptome profiling were collected at the exponential growth phase. FASTQ files were deposited in the NCBI Sequence Read Archive and are accessible through BioProject PRJNA627079. Information about bacterial culture samples, sequence read statistics and sequence coverage is shown in Table 1. The PCA plot of the RNA-seq data presented in Fig. 1 shows the variance between sample groups and between sample replicates according to gene expression levels. Each dot in Fig. 1 represents a single sample.

RNA isolation, cDNA library preparation and sequencing
Bacterial cells in mid-log phase cultures were fixed by adding phenol/ethanol (1/20 v/v) solution to a final concentration of 20% and kept on ice for 30 min. The fixed cells were harvested (8000 g, 5 min, 4 °C) and resuspended in 1 mL of ExtractRNA Reagent (Evrogen, Russia). The subsequent procedures were performed according to the manufacturer's instructions. Residual DNA was eliminated by treating the RNA samples with DNase I (Thermo Fisher, USA). Total RNA was processed using the Ribo-Zero rRNA Removal Kit (Gram-Negative Bacteria) (Illumina, USA) and the NEBNext Ultra Directional RNA Library Prep Kit for Illumina (NEB, USA) according to the manufacturers' instructions. The quality and quantity of the cDNA libraries were monitored during processing and before sequencing using the Agilent 2100 Bioanalyzer (Agilent Technologies, USA) and the CFX96 Touch Real-Time PCR Detection System (Bio-Rad Laboratories, USA). Sequencing was performed on a HiSeq 2500 Sequencing System (Illumina) at the Joint KFU-Riken Laboratory, Kazan Federal University (Kazan, Russia).

Read alignment to the reference genome
The reads were mapped onto the genome sequence of P. versatile strain 3-2 (GenBank accession CP024842), which is the wild type parent of the laboratory strain JN42. BWA version 0.7.16a [7] was used to build the index of the reference genome and to align the reads to it with default aligner parameters. The SAM alignment files created by BWA were converted to sorted BAM files with SAMtools v. 1.10 [8] using the samtools sort command. Read mapping statistics are presented in Table 1.
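The mapping steps described above can be sketched as a small Python helper that assembles the corresponding command lines. This is an illustrative sketch only, not the authors' actual script. The text does not state which BWA algorithm was used, so `bwa mem` is an assumption here, and the file names (`CP024842.fasta`, `JN42_rep1.fastq`) and the sample name are hypothetical placeholders.

```python
import subprocess
from pathlib import Path


def build_alignment_commands(reference_fasta, fastq_files, sample_name, out_dir="."):
    """Assemble the index, align and sort commands for one sample.

    Nothing is executed here; the commands are returned as argument lists.
    """
    out = Path(out_dir)
    sam_path = out / f"{sample_name}.sam"
    bam_path = out / f"{sample_name}.sorted.bam"
    return {
        # Build the BWA index of the reference genome.
        "index": ["bwa", "index", str(reference_fasta)],
        # Assumption: the 'mem' algorithm with default parameters.
        # bwa mem writes SAM to standard output.
        "align": ["bwa", "mem", str(reference_fasta), *[str(f) for f in fastq_files]],
        # Convert the SAM file to a coordinate-sorted BAM file.
        "sort": ["samtools", "sort", "-o", str(bam_path), str(sam_path)],
        "sam": sam_path,
        "bam": bam_path,
    }


def run_alignment(reference_fasta, fastq_files, sample_name, out_dir="."):
    """Run the three steps; requires bwa and samtools on the PATH."""
    cmds = build_alignment_commands(reference_fasta, fastq_files, sample_name, out_dir)
    subprocess.run(cmds["index"], check=True)
    # Redirect the SAM output of bwa mem into a file.
    with open(cmds["sam"], "w") as sam_file:
        subprocess.run(cmds["align"], check=True, stdout=sam_file)
    subprocess.run(cmds["sort"], check=True)
    return cmds["bam"]
```

For example, `run_alignment("CP024842.fasta", ["JN42_rep1.fastq"], "JN42_rep1")` would index the reference, align one (hypothetical) FASTQ file and write `JN42_rep1.sorted.bam`. Keeping command construction separate from execution makes it easy to inspect or log the exact commands before running them.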

Authors' contributions and ethics statement
Natalia Gogoleva: Investigation, Methodology. Uljana Kravchenko: Investigation, Software. Yevgeny Nikolaichik: Conceptualization, Data curation, Writing - Original draft preparation, Review & Editing. Yuri Gogolev: Conceptualization, Supervision, Writing - Original draft preparation, Review & Editing, Funding acquisition. All ethical requirements for such studies were observed in the preparation of the publication. The work did not involve human subjects and did not include experiments with animals.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships which have, or could be perceived to have, influenced the work reported in this article.

Acknowledgments
This work was supported by grants from the Russian Foundation for Basic Research (project no. 18-54-00021) and the Belarusian Republican Foundation for Basic Research. The RNA-Seq analysis was supported by the Russian Science Foundation (project no. 17-14-01363). The cDNA library preparation was carried out with the financial support of the Ministry of Science and Higher Education of the Russian Federation (grant no. 075-15-2019-1881). DNA sequencing was performed within the framework of the government assignment for FRC Kazan Scientific Center of RAS.
The study was carried out using the equipment of the CSF-SAC FRC KSC RAS.

Supplementary materials
Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.dib.2020.106123.