Pediococcus pentosaceus IMI 507025 genome sequencing data

The genome sequence data for Pediococcus pentosaceus IMI 507025, an isolate from pickled cucumbers, are reported. The raw reads and the analysed genome sequence were deposited at NCBI under BioProject accession number PRJNA814992. The number of contigs was 17 before trimming and 12 after trimming. The total size of the genome was 1,795,439 bp, containing 1,811 genes, of which 1,751 were coding sequences. The identity of IMI 507025 was determined via average nucleotide identity (ANI): a value of 99.5994% between IMI 507025 and the type strain P. pentosaceus ATCC 33316 identified the strain as P. pentosaceus. Screening of the IMI 507025 genome for antimicrobial resistance (AMR) and virulence genes returned no hits, confirming the safety of the tested strain. No plasmids were detected.


Specifications
istics and strain identity.
• The sequencing data could be used for Pediococcus comparative genomics and for evaluation of genes of concern among lactic acid bacteria members.

Data Description
The whole genome sequencing data of Pediococcus pentosaceus (P. pentosaceus) IMI 507025 are described, together with the taxonomic identification data and the results of genome screening for AMR genes, virulence factors and plasmids.
The whole genome sequencing coverage was 1020×. The annotated assembly consisted of 12 contigs with a total length of 1,794,629 bp, a GC content of 37.03% and a contig N50 of 354,566 bp. The annotation produced 1,811 genes, of which 1,751 were coding sequences, 53 were RNA genes (2 ribosomal RNAs, 47 transfer RNAs and 4 miscellaneous RNAs) and 7 were pseudogenes.
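The assembly statistics above can be re-derived from a list of contig lengths. The following is a minimal sketch, assuming the common definition of N50 (the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the assembly length). The function name `n50` and the example contig lengths are hypothetical and are not the IMI 507025 contig lengths; only the gene counts in the final check are taken from the text.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths in bp."""
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half of the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Hypothetical contig lengths, for illustration only.
example_contigs = [10, 9, 8, 7, 6, 5, 4, 3, 2]
print(n50(example_contigs))

# Consistency check of the annotation counts reported in the text:
# coding sequences + RNA genes + pseudogenes should equal the total gene count.
assert 1751 + 53 + 7 == 1811
# rRNAs + tRNAs + miscellaneous RNAs should equal the RNA gene count.
assert 2 + 47 + 4 == 53
```

Integer arithmetic (`running * 2 >= total`) is used instead of dividing by two so that assemblies with an odd total length are handled without rounding.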
The genome comparison showed the best hit (low distance and high matching) to Pediococcus pentosaceus CGMCC 7049 (Table 1).
Table 2 summarises the genomes that were included in the comparison study via OrthoANI.
Table 3 reports the outcome of the comparison of IMI 507025 with closely related P. pentosaceus strains. The pairwise comparisons showed 99.6397% identity between the IMI 507025 and P. pentosaceus CGMCC 7049 genomes. The ANI match with the P. pentosaceus type strain ATCC 33316 was 99.5994%. The species identification cut-off is set at 95% [2].
The threshold values used for AMR and virulence gene screening were those proposed by the European Food Safety Authority (EFSA), namely that sequences with above 80% identity and 70% coverage should be considered for further analysis [2]. The genome searches revealed no AMR genes and no virulence or pathogenicity factors in the sequenced genome of strain IMI 507025. The bioinformatic analysis did not identify putative plasmids in the sequenced data.
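The two decision rules described above (the 95% ANI species cut-off and the 80% identity / 70% coverage screening thresholds) can be expressed as simple filters. This is a minimal sketch and not the pipeline used in the study: the `Hit` record, the function names and the example hits are hypothetical, and treating the ANI cut-off as inclusive (`>=`) is an assumption, since the text does not say whether a value of exactly 95% counts.

```python
from dataclasses import dataclass

IDENTITY_MIN = 80.0        # percent identity; hits must be above this value
COVERAGE_MIN = 70.0        # percent coverage; hits must be above this value
ANI_SPECIES_CUTOFF = 95.0  # percent ANI used for species identification


@dataclass
class Hit:
    """Hypothetical record for one database hit."""
    gene: str
    identity: float  # percent
    coverage: float  # percent


def hits_for_follow_up(hits):
    """Keep only hits above both screening thresholds."""
    return [h for h in hits
            if h.identity > IDENTITY_MIN and h.coverage > COVERAGE_MIN]


def same_species(ani_percent):
    """Assumption: the cut-off is inclusive."""
    return ani_percent >= ANI_SPECIES_CUTOFF


# ANI values reported in the text both exceed the species cut-off.
print(same_species(99.5994), same_species(99.6397))

# Made-up hits, for illustration only: just the first passes both thresholds.
example = [Hit("geneA", 95.0, 90.0), Hit("geneB", 80.0, 99.0), Hit("geneC", 99.0, 60.0)]
print([h.gene for h in hits_for_follow_up(example)])
```

An empty result from `hits_for_follow_up` corresponds to the outcome reported for IMI 507025, where no sequence required further analysis.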
Based on the data presented above, strain IMI 507025 was unequivocally identified as Pediococcus pentosaceus. In addition, the safety-related data described confirm that P. pentosaceus IMI 507025 does not raise safety concerns.

DNA Extraction
For DNA extraction, 10 mL MRS broth cultures were incubated aerobically at 30 °C for 16-17 h. The cells were centrifuged (1780 rcf, 10 min) and the pellet was used for DNA extraction according to a previously described procedure [8].

Whole Genome Sequencing, Assembly, and Annotation
The DNA was sequenced at Eurofins Genomics (Constance, Germany) on an Illumina NovaSeq 6000 using a 150 bp paired-end library, yielding 6,688,243 reads. Trimmomatic v. 0.38.1 [3] was used for trimming the reads and Unicycler v. 0.4.8 [4] for assembly. The average reference coverage (total number of bases / assembly length) of the assembly was 1020-fold. Gene predictions and functional annotations were performed using the NCBI Prokaryotic Genome Annotation Pipeline v6.0 [5].
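The coverage definition given above (total number of bases divided by assembly length) is sketched below. The function names are hypothetical. The number of bases actually used in the calculation is not stated in the text, so the example only back-calculates the base count implied by the reported 1020-fold coverage and the 1,794,629 bp annotated assembly; it does not reproduce the authors' computation.

```python
def average_coverage(total_bases, assembly_length):
    """Average coverage = total number of sequenced bases / assembly length."""
    if assembly_length <= 0:
        raise ValueError("assembly_length must be positive")
    return total_bases / assembly_length


def bases_for_coverage(coverage, assembly_length):
    """Inverse of average_coverage: bases needed to reach a given coverage."""
    return coverage * assembly_length


ASSEMBLY_LENGTH = 1_794_629  # bp, annotated assembly length from the text
REPORTED_COVERAGE = 1020     # fold, from the text

implied_bases = bases_for_coverage(REPORTED_COVERAGE, ASSEMBLY_LENGTH)
print(implied_bases)
print(average_coverage(implied_bases, ASSEMBLY_LENGTH))
```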

Taxonomic Identification
Mash using MinHash v. 0.1.1 [6] and OrthoANI v. 1.40 [7] were used for strain identification via alignment-free genome distance estimation and calculation of average nucleotide identity, respectively.
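To illustrate what alignment-free distance estimation means, the following is a simplified sketch of the bottom-k MinHash idea behind Mash. It is not the Mash implementation: it uses BLAKE2b from the Python standard library as the hash function, ignores reverse-complement (canonical) k-mers, and the function names, default parameters and toy sequences are assumptions made for illustration. The distance formula D = -(1/k) ln(2j / (1 + j)), where j is the estimated Jaccard index, follows the Mash publication.

```python
import hashlib
import math


def kmers(seq, k):
    """Set of all k-mers in a sequence (no canonicalisation in this sketch)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sketch(seq, k=21, size=1000):
    """Bottom-k sketch: the `size` smallest distinct k-mer hash values."""
    hashes = {
        int.from_bytes(hashlib.blake2b(km.encode(), digest_size=8).digest(), "big")
        for km in kmers(seq, k)
    }
    return sorted(hashes)[:size]


def jaccard_estimate(sketch_a, sketch_b, size):
    """Estimate the Jaccard index from two bottom-k sketches."""
    merged = sorted(set(sketch_a) | set(sketch_b))[:size]
    if not merged:
        raise ValueError("both sketches are empty")
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in shared) / len(merged)


def mash_distance(jaccard, k):
    """Mash distance from a Jaccard estimate; 1.0 when nothing is shared."""
    if jaccard == 0:
        return 1.0
    return -math.log(2 * jaccard / (1 + jaccard)) / k


# Toy sequences, for illustration only.
a = sketch("ACGTTGCAAGCTTAGC" * 4, k=5, size=50)
b = sketch("ACGTTGCAAGCTTAGG" * 4, k=5, size=50)
j = jaccard_estimate(a, b, 50)
print(j, mash_distance(j, 5))
```

A low distance between two genomes, as reported for the best hit, corresponds to a high proportion of shared k-mer hashes between their sketches.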

Screening for AMR and Virulence Factors Related Genes
Two databases were used to search for AMR genes: the NCBI Bacterial AMR Reference Gene Database (v. 2021-06-01.1) and the ResFinder database (downloaded on 20.04.2021). Screening for virulence factors was performed using the Virulence Factor Database (VFDB). Default parameters were used except where otherwise stated in a previously published study [8].

Screening for Plasmids
The PlasmidFinder database [9] and BLAST searches were used to look for plasmid-related contigs in the sequenced genome, and the assembly files were examined for the presence of circular contigs.

Declaration of Competing Interest
The authors I.N. and C.A.M. are employees of Alltech, which produces the Pediococcus pentosaceus IMI 507025 evaluated in this study.