Ligilactobacillus salivarius 2102-15 complete genome sequence data

This article presents whole genome sequencing data for Ligilactobacillus salivarius 2102-15, generated using the Illumina and Oxford Nanopore platforms. The genome of the isolate consists of a chromosome and two plasmids. Data on bacteriocin-encoding genes present in the genome were collected through genome annotation and with the BAGEL4 tool. The advantages and limitations of both approaches are highlighted. The data indicate the presence of different types of bacteriocin and immunity protein-encoding genes on both the chromosome and one of the plasmids. The data are of interest to researchers working in whole genome sequencing and analysis, and are useful for the identification of novel probiotic bacteria and their biomedical applications.


Subject: Biological Sciences; Omics: Genomics
Specific subject area: Genomics, applied microbiology
Type of data: Raw sequencing reads; assembled genome; annotated genome; tables; figures

Value of the Data
• The data contain the complete genome sequence of Ligilactobacillus salivarius 2102-15 and are useful for a potential application of this strain as a probiotic.
• The data would benefit researchers involved in comparative genomics studies.
• The data on bacteriocin-related genes are of particular interest for the development of novel antimicrobial strategies against infections caused by multidrug-resistant bacteria.
• The data on bacteriocin-encoding genes of both chromosomal and plasmid origin, and the strategy used for their identification, would assist scientists studying the genetic origin and application of these antimicrobial compounds.

Objective
The aim of this article is to provide data on the complete genome sequence of Ligilactobacillus salivarius 2102-15, which consists of a chromosome and two plasmids. The article also provides the raw genome sequencing data, including the number and sizes of the long and short sequencing reads used for assembly, together with specific details of the genome assembly and data on bacteriocin-encoding genes.

Data Description
Ligilactobacillus salivarius strain 2102-15 was isolated from the vaginal exudate of a healthy woman of reproductive age. The woman was selected during a preventive medical examination among a group of patients of reproductive age who were planning to have a child. The strain was cultivated in De Man, Rogosa, and Sharpe (MRS) broth and on MRS agar (HiMedia, India) under anaerobic conditions. The isolate was identified at the genus and species levels as L. salivarius using the MALDI-TOF-MS method [1].
This article reports data on the complete genome sequence of this isolate, which consists of a chromosome and two plasmids. The data on bacteriocin-related genes found in this genome are provided in more detail, as they may contribute to specific properties of this isolate. The genome sequence of Ligilactobacillus salivarius 2102-15 is available at NCBI GenBank under the accession numbers CP090411:CP090413. The total size of the genome is 2,017,204 bp with an average GC content of 33.07 % (Table 1). In addition to a circular chromosome (1,834,593 bp), the strain contains two circular plasmids (140,826 bp and 41,785 bp) (Figs. 1-3).
The data demonstrate the presence of several genes encoding different bacteriocin-related proteins involved in the production of, and immunity to, these compounds. There are genes encoding compounds similar to salivaricin, nisin and enterolysins. These data (presented in Table 2) were obtained using two different approaches: keyword searches of the genome annotation generated by NCBI GenBank, and the bacteriocin gene detection software BAGEL4. Whilst two chromosomal genes encoding enterolysin A were detected by the latter, the corresponding gene products were annotated by GenBank as a 'phage tail' protein (locus tag LZF92_01565) and a 'peptidoglycan DD-metalloendopeptidase family protein' (locus tag LZF92_08285). The genes encoding these proteins are located at positions 319,667 bp to 321,982 bp and 1,626,121 bp to 1,629,057 bp, respectively. Fig. 4 shows that the former is located in an intact prophage spanning the region between 287 kb and 328 kb. Additional data on bacteriocin-encoding genes are presented in Table 2 and in Figs. 1 and 2, which show the chromosome-located genes encoding enterolysin A and the genes encoding lantibiotics (a and b chains of salivaricin, and nisin) present on the larger plasmid. Lantibiotics and the bacteria producing them have been found useful for fighting infections, especially those caused by multidrug-resistant bacteria [2].
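The sizes and coordinates quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: it uses only the numbers given in the text, assumes that the gene coordinates are 1-based and inclusive, and treats the prophage boundaries as the rounded values 287,000 and 328,000 bp. The dictionary keys and function names are illustrative.

```python
# Replicon sizes (bp) as reported in the text.
REPLICONS = {
    "chromosome": 1_834_593,
    "larger_plasmid": 140_826,
    "smaller_plasmid": 41_785,
}
REPORTED_TOTAL = 2_017_204

# Gene coordinates as reported in the text (assumed 1-based, inclusive).
GENES = {
    "LZF92_01565": (319_667, 321_982),
    "LZF92_08285": (1_626_121, 1_629_057),
}

# The prophage boundaries are given only approximately (287 kb to 328 kb).
PROPHAGE = (287_000, 328_000)


def gene_length(start: int, end: int) -> int:
    """Length in bp of a feature with 1-based inclusive coordinates."""
    return end - start + 1


def is_within(feature: tuple, region: tuple) -> bool:
    """True if the feature lies entirely inside the region."""
    return region[0] <= feature[0] and feature[1] <= region[1]


total = sum(REPLICONS.values())
print(f"sum of replicons: {total:,} bp (reported: {REPORTED_TOTAL:,} bp)")

for tag, coords in GENES.items():
    location = "inside" if is_within(coords, PROPHAGE) else "outside"
    print(f"{tag}: {gene_length(*coords):,} bp, {location} the prophage region")
```

Under these assumptions the three replicon sizes add up to the reported genome size, and only LZF92_01565 falls within the prophage region, which agrees with the description above.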
There is also an additional bacteriocin-related gene that was found by the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) but not by BAGEL4. Overall, the data demonstrate the importance of combining different tools for the identification of bacteriocin-related genes.
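Why the two approaches complement each other can be illustrated with a small sketch that compares keyword hits in annotated product names with hits from a dedicated detection tool. This is an illustration only and not the procedure used for Table 2: the keyword list is an assumption, the entry tagged HYPOTHETICAL_0001 is a placeholder rather than a real locus tag, and only the two enterolysin A locus tags and their product names are taken from the text.

```python
# Keywords for searching annotated product names (an assumed list).
KEYWORDS = ("bacteriocin", "lantibiotic", "salivaricin", "nisin", "enterolysin")

# (locus_tag, product) pairs. The first two are taken from the text; the third
# is a HYPOTHETICAL placeholder for a product annotated as a bacteriocin.
ANNOTATION = [
    ("LZF92_01565", "phage tail protein"),
    ("LZF92_08285", "peptidoglycan DD-metalloendopeptidase family protein"),
    ("HYPOTHETICAL_0001", "bacteriocin"),
]

# Locus tags reported by the detection tool: here, the two enterolysin A hits
# described in the text.
TOOL_HITS = {"LZF92_01565", "LZF92_08285"}


def keyword_hits(annotation, keywords=KEYWORDS):
    """Return locus tags whose product name contains any of the keywords."""
    return {
        tag
        for tag, product in annotation
        if any(keyword in product.lower() for keyword in keywords)
    }


annotation_hits = keyword_hits(ANNOTATION)
combined = annotation_hits | TOOL_HITS

print("annotation only:", sorted(annotation_hits - TOOL_HITS))
print("tool only:      ", sorted(TOOL_HITS - annotation_hits))
print("combined:       ", sorted(combined))
```

In this toy example neither approach alone recovers all three entries: the keyword search misses the two genes whose product names do not mention a bacteriocin, and the tool misses the placeholder entry.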

Sample Preparation, Genome Sequencing and Annotation
The strain was grown on MRS agar (Oxoid, USA) for 24 h at 37 °C in an anaerobic atmosphere (5 % hydrogen, 10 % carbon dioxide and the rest nitrogen). Bacteria were resuspended in Tris-EDTA buffer with lysozyme (0.1 mg/ml) and RNase A (0.1 mg/ml) and incubated for 25 min at 37 °C, followed by addition of proteinase K and SDS to 0.1 mg/ml and 0.5 %, respectively. After incubation at 65 °C for 5 min, genomic DNA was purified using solid-phase reversible immobilization (SPRI) beads (Beckman, USA) and resuspended in elution buffer (Qiagen, Germany).
Complete genome sequencing was performed using short and long sequencing reads produced by the Illumina and Oxford Nanopore sequencing platforms, respectively. The short-read DNA library was prepared using the Nextera XT library prep kit (Illumina, San Diego, CA) following the manufacturer's protocol. Short reads were generated on the Illumina NovaSeq 6000 platform. Long-read genomic DNA libraries were prepared with the Oxford Nanopore Technologies (ONT; United Kingdom) SQK-LSK109 kit with the native barcoding EXP-NBD104/114 (ONT) kit. Sequencing was done with a FLO-MIN106 (R9.4.1) flow cell in a GridION system (ONT). Illumina reads were adapter trimmed using Trimmomatic 0.30 with a sliding window quality cutoff of Q15 [3]. All sequencing reads were assembled using Unicycler v.0.4.9b [4]. The whole genome assembly coverage was 80.0x. The assembly was annotated by the NCBI Prokaryotic Genome Annotation Pipeline [5] using the best-placed reference protein set and GeneMarkS-2+. The complete genome sequence is represented by three circular molecules: a chromosome (1,834,593 bp) and two plasmids (140,826 and 41,785 bp), maps of which were generated using the online Proksee server [6] available at https://proksee.ca/ . Gene clusters involved in bacteriocin biosynthesis and immunity were identified using the BAGEL4 [7] software available at http://bagel4.molgenrug.nl/ . The search for prophages was performed using the online tool PHASTER [8] available at http://phaster.ca/ . The online tools Proksee, BAGEL4 and PHASTER were run on 04/06/2023, 10/06/2023 and 15/06/2023, respectively. Default parameters were used for all software unless otherwise specified.
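The trimming and hybrid assembly steps can be sketched as command lines assembled in Python. This is a minimal sketch under stated assumptions rather than the commands actually run: all file names are hypothetical, the `trimmomatic` executable name assumes a wrapper script is on the PATH, and the adapter file, the ILLUMINACLIP settings (2:30:10) and the window size of 4 are illustrative values, since the text specifies only the Q15 sliding window cutoff. The commands are built and printed, not executed.

```python
import shlex


def trimmomatic_cmd(r1, r2, prefix, adapters="adapters.fa", window=4, quality=15):
    """Build a paired-end Trimmomatic command (adapter clipping + sliding window).

    The adapter file, ILLUMINACLIP settings and window size are assumptions;
    only the Q15 cutoff comes from the text.
    """
    return [
        "trimmomatic", "PE", r1, r2,
        f"{prefix}_1P.fastq.gz", f"{prefix}_1U.fastq.gz",
        f"{prefix}_2P.fastq.gz", f"{prefix}_2U.fastq.gz",
        f"ILLUMINACLIP:{adapters}:2:30:10",
        f"SLIDINGWINDOW:{window}:{quality}",
    ]


def unicycler_cmd(r1, r2, long_reads, outdir):
    """Build a Unicycler hybrid assembly command (short pairs plus long reads)."""
    return ["unicycler", "-1", r1, "-2", r2, "-l", long_reads, "-o", outdir]


# Hypothetical file names, for illustration only.
trim = trimmomatic_cmd("illumina_R1.fastq.gz", "illumina_R2.fastq.gz", "trimmed")
assemble = unicycler_cmd(
    "trimmed_1P.fastq.gz", "trimmed_2P.fastq.gz", "nanopore.fastq.gz", "assembly"
)

print(shlex.join(trim))
print(shlex.join(assemble))
```

Each list could be passed to `subprocess.run(cmd, check=True)` once the tools are installed and the input files exist.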

Ethics Statement
Not applicable as none of the experiments described in this report involved any human or animal specimens.

Fig. 1. A, circular map of the Ligilactobacillus salivarius 2102-15 chromosome; GC skew+ and GC skew- regions are shown in green and pink, respectively. Positions of rRNA genes are indicated in green. B, identification of two chromosomal gene clusters containing genes encoding enterolysin-like proteins using BAGEL4 software.

Fig. 2. A, genetic map of plasmid pLS2102-15_1; GC skew+ and GC skew- regions are shown in green and pink, respectively. B, identification of two chromosomal gene clusters containing genes encoding enterolysin-like proteins using BAGEL4 software. Two light-green ORFs encode the two chains of the bacteriocin salivaricin. The dark-green ORF 00022 in this cluster encodes another bacteriocin, also annotated as such in GenBank. However, unlike the GenBank annotation, BAGEL4 failed to identify ORF 00023 as a bacteriocin-encoding gene.

Table 2
Selected putative gene products related to bacteriocin biosynthesis and immunity in Ligilactobacillus salivarius 2102-15.