Draft genome sequence dataset of Latilactobacillus curvatus PN39MY isolated from fermented vegetables

Here we report the draft genome sequence of Latilactobacillus curvatus strain PN39MY, isolated from traditional Lithuanian homemade fermented cucumber. The genome consisted of 83 contigs with a total size of 1,899,018 bp, an N50 of 40,562 bp and a GC content of 42.1%. After sequence trimming, 83 contigs were annotated, and 1910 of the annotated genes were coding sequences. The average nucleotide identity (ANI) between PN39MY and Latilactobacillus curvatus ZJUNIT8 was 99.45%, identifying the strain as Latilactobacillus curvatus. No genes related to antimicrobial resistance or virulence factors were found. The data presented here can be used in comparative genomics to identify antimicrobial resistance genes, plasmids and/or virulence factors that may be present in related Latilactobacillus species. The draft genome sequence data were deposited at NCBI under BioProject accession number PRJNA941180.


Value of the Data
• Latilactobacillus curvatus is a lactic acid bacterium commonly used for food fermentation and to improve human and animal health. The sequencing data described here unveils the identity and safety-related characteristics of Latilactobacillus curvatus PN39MY, a valuable fermented food isolate.
• The information presented for Latilactobacillus curvatus PN39MY is significant because the bacterium possessed none of the antimicrobial resistance genes such as macB, vanHD, vanL, baeR, baeS, mfd, lmrC, lmrD, salA, lsaA, rpsJ, tetT, tetM, efrA, efrB, tetA, vanL, TaeA, vgaD and rpsJ present in many other L. curvatus isolates reported in previous studies [1].
• The sequencing data and the described microbial bioinformatics workflow can be applied in comparative genomics and in the search for virulence factors, antibiotic resistance genes as well as plasmids in related lactic acid bacteria species.

Objective
Latilactobacillus curvatus is commonly found in fermented foods. Although most strains are generally regarded as safe because of their long history of use without reported risks, they may acquire antimicrobial resistance and virulence genes from the environment, and these genes can be transferred to pathogens. For this reason, we analyzed the whole genome sequence of Latilactobacillus curvatus PN39MY to gain insight into the antimicrobial resistance genes and virulence factors it might possess. The strain was chosen because it showed the best ability, among 18 other lactic acid bacteria isolated from Lithuanian homemade pickles, to generate fermented beetroot with antidiabetic potential (data not shown).

Data Description
Here, we report the draft genome sequence data of Latilactobacillus curvatus PN39MY, including its potential antimicrobial resistance and virulence factors. The genome consisted of 83 contigs with a total size of 1,899,018 bp, an N50 of 40,562 bp and a GC content of 42.1% (Fig. 1). .2023] with a median total length of 1,920,030 bp, a median protein count of 1844 and a median GC% of 41.9. This implies that sequencing of PN39MY yielded an essentially complete genome (98.91%), comparable in size to the expected median length. The average nucleotide identity (ANI) between PN39MY and Latilactobacillus curvatus ZJUNIT8 was 99.45% (supplementary data), indicating that PN39MY is a Latilactobacillus curvatus strain. The genome was screened against several antimicrobial resistance gene databases, namely NCBI Bacterial Antimicrobial Resistance Finder Plus, ResFinder 4.1 and the Comprehensive Antibiotic Resistance Database [2]. No antimicrobial resistance genes or virulence genes were identified. In addition, neither pathogenic factors nor plasmids were detected in the genome. The draft genome sequence dataset was deposited at NCBI GenBank under accession number JARGYE000000000.
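The assembly statistics quoted above (N50 and GC%) can be recomputed from any set of contigs. The following is a minimal sketch of the standard definitions, not the authors' pipeline; the function names `n50` and `gc_percent` are hypothetical and the toy inputs are illustrative only.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50: the length of the contig at which the running sum of
    lengths (longest first) reaches at least half of the total assembly size."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable: running sum always reaches half_total")


def gc_percent(contigs: Iterable[str]) -> float:
    """Return the GC content (%) over all A/C/G/T bases in the contigs.
    Ambiguous bases such as N are ignored (an assumption of this sketch)."""
    gc = 0
    acgt = 0
    for seq in contigs:
        seq = seq.upper()
        g_and_c = seq.count("G") + seq.count("C")
        gc += g_and_c
        acgt += g_and_c + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("no A/C/G/T bases found")
    return 100.0 * gc / acgt


# Ratio of the reported assembly size to the reported median length.
size_ratio_percent = round(100 * 1_899_018 / 1_920_030, 2)
```

Here `size_ratio_percent` evaluates to 98.91, which coincides with the percentage quoted in the text. Whether the authors derived their figure this way is an assumption.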

DNA Extraction and Whole Genome Sequencing
The QIAGEN DNeasy PowerSoil Pro Kit was used to isolate DNA from the bacteria following the manufacturer's protocol. The quantity of DNA in the samples was measured using the QuantiFluor® dsDNA System chemistry with the GloMax Plate Reader System. DNA libraries were prepared with the Nextera XT DNA Library Preparation Kit coupled with IDT Unique Dual Indexes, using 1 ng of DNA as input. The genomic DNA was fragmented using the Illumina Nextera XT fragmentation enzyme and a unique dual index was added to the sample. The library was then constructed using 12 cycles of PCR. The DNA library was purified using AMPure magnetic beads and eluted in QIAGEN EB buffer. The quantity of DNA in the library was measured with a Qubit 4 fluorometer and a Qubit dsDNA HS Assay Kit. The library was then sequenced on the Illumina NovaSeq platform with 2 × 150 bp reads.

Taxonomic Identification of the Strain
Raw paired-end reads were trimmed and processed using BBDuk, with a read quality trimming parameter of 22. SPAdes was then used to assemble the trimmed reads with the "--careful" parameter. The lineage_wf function in CheckM was used to assess the completeness of the assembly. The CosmosID core genome SNP typing pipeline was used to examine the assembled contigs and to draw epidemiological conclusions based on phylogenetic placement and SNP variations. Parsnp was used as the core genome aligner to align the core genomes of several microbial genomes. The final set of core-genome SNPs was used to reconstruct the phylogenomic relationship among the genomes using FastTree2, as shown in Fig. 2.

Fig. 2. Phylogenetic tree of PN39MY bacterium. FastTree2 was used to reconstruct the phylogenetic relationship from the core-genome SNPs. The GenBank accession numbers are indicated in the phylogenetic tree.
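The trimming, assembly and completeness steps described above could be scripted roughly as follows. This is a hedged sketch, not the authors' actual commands. The source gives only the quality value 22, SPAdes's careful mode and the CheckM `lineage_wf` workflow. The file names, directory names, trimming direction (`qtrim=rl`) and the `-x fasta` extension are assumptions. The functions only build the argument lists, so nothing is executed unless they are passed to `subprocess.run`.

```python
from typing import List


def bbduk_cmd(r1: str, r2: str, out1: str, out2: str, trimq: int = 22) -> List[str]:
    """Quality-trim paired reads with BBDuk.
    qtrim=rl (trim both ends) is an assumption; the source states only trimq=22."""
    return ["bbduk.sh", f"in={r1}", f"in2={r2}", f"out={out1}", f"out2={out2}",
            "qtrim=rl", f"trimq={trimq}"]


def spades_cmd(r1: str, r2: str, out_dir: str) -> List[str]:
    """Assemble trimmed paired reads with SPAdes in careful mode."""
    return ["spades.py", "-1", r1, "-2", r2, "--careful", "-o", out_dir]


def checkm_cmd(bin_dir: str, out_dir: str, extension: str = "fasta") -> List[str]:
    """Assess assembly completeness with the CheckM lineage workflow.
    bin_dir is a directory holding the assembly file(s) with the given extension."""
    return ["checkm", "lineage_wf", "-x", extension, bin_dir, out_dir]


# Hypothetical file and directory names, for illustration only.
pipeline = [
    bbduk_cmd("raw_R1.fastq.gz", "raw_R2.fastq.gz", "trim_R1.fastq.gz", "trim_R2.fastq.gz"),
    spades_cmd("trim_R1.fastq.gz", "trim_R2.fastq.gz", "spades_out"),
    checkm_cmd("assembly_dir", "checkm_out"),
]
```

To run the steps, each list in `pipeline` could be passed to `subprocess.run(cmd, check=True)`, provided the three tools are installed and on the PATH.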

Search for Antimicrobial Resistance Genes and Virulence Factors
For antimicrobial resistance (AMR) gene and virulence factor (VF) detection, the assembled genome of PN39MY was compared to the ResFinder AMR and VFDB VF databases [3] using ABRicate version 1.0.1. AMR and VF genes were considered present only if they matched the assembled genome at a nucleotide identity > 90% and the alignment covered > 60% of the gene's sequence length. The average nucleotide identity (ANIm) between isolates was calculated using MUMmer [4]. The Prokka annotation pipeline [5] was then used to annotate the genome.
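The presence criteria above (identity > 90% and coverage > 60%) amount to a simple filter over ABRicate's tab-separated report. The sketch below assumes the report carries the `GENE`, `%COVERAGE` and `%IDENTITY` columns of ABRicate's standard output. The function name `filter_hits` and the toy rows are hypothetical and do not come from the authors' data.

```python
import csv
import io
from typing import List


def filter_hits(tsv_text: str,
                min_identity: float = 90.0,
                min_coverage: float = 60.0) -> List[str]:
    """Return gene names whose hits exceed both thresholds (strictly greater,
    matching the '>' comparisons stated in the text)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = []
    for row in reader:
        identity = float(row["%IDENTITY"])
        coverage = float(row["%COVERAGE"])
        if identity > min_identity and coverage > min_coverage:
            kept.append(row["GENE"])
    return kept


# Toy report with made-up gene names; only the columns used here are included.
toy_report = (
    "GENE\t%COVERAGE\t%IDENTITY\n"
    "geneA\t100.00\t99.50\n"   # passes both thresholds
    "geneB\t55.00\t98.00\n"    # fails coverage
    "geneC\t80.00\t90.00\n"    # fails identity (not strictly above 90)
)
passing = filter_hits(toy_report)
```

For PN39MY the text reports that no AMR or VF hit met these criteria, which corresponds to this filter returning an empty list.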

Search for Plasmids
The assembled contigs were compared to the PlasmidFinder database [6,7] to identify any plasmids present.

Ethics Statement
No animal or human studies were conducted in this work.

Declaration of Competing Interest
The authors declare that they have no conflicts of interest that could have appeared to influence the work reported in this paper.