Characterisation data of simple sequence repeats of phages closely related to T7M

Coliphages T7M and T3, Yersinia phage ϕYeO3-12, and Salmonella phage ϕSG-JL2 share high homology in their genomic sequences. Simple sequence repeats (SSRs) are found in their genomes, and variations of SSRs among these phages are observed. Analyses of the sequence regions in the T7M and T3 genomes that are likely derived from phage recombination, as well as their counterparts in ϕYeO3-12 and ϕSG-JL2, have been discussed by Lin in “Simple sequence repeat variations expedite phage divergence: mechanisms of indels and gene mutations” [1]. These regions are referred to as recombinant regions. The focus here is on SSRs in the whole genome and in the sequence regions outside the recombinant regions, referred to as non-recombinant regions. This article provides SSR counts, relative abundance, relative density, and GC contents in the complete genomes and non-recombinant regions of these phages. SSR period sizes and motifs in the non-recombinant regions of the phage genomes are plotted. Genomic sequence changes between T7M and T3 due to insertions, deletions, and substitutions are also illustrated. SSRs and nearby sequences of T7M in the non-recombinant regions are compared to the sequences of ϕYeO3-12 and ϕSG-JL2 at the corresponding positions. The sequence variations of SSRs due to vertical evolution are classified into four categories and tabulated: (1) insertion/deletion of SSR units, (2) expansion/contraction of SSRs without alteration of genome length, (3) changes of repeat motifs, and (4) generation/loss of repeats.

Keywords: Simple sequence repeats; T7M; Bacteriophage genome; SSR variability classification

Subject area: Biology
More specific subject area: Genome evolution and sequence mutations
Type of data: Figure, tables
How data was acquired: Software (ClustalW, IMEx) and manual analysis of the sequences
Data format: Analyzed
Experimental factors: Genome sequences were retrieved from NCBI for analysis.
Experimental features: Manual characterization and analysis
Data source location: National Chiao Tung University, Hsinchu, Taiwan
Data accessibility: Data are within this article.

Value of the data
Reveals different types of SSR sequence changes arising from vertical evolution of genomes.
Detailed SSR distributions may aid in identifying broader patterns of phage evolution.
Provides a guideline for classifying SSR variations in genome comparisons.
Variations of SSRs in phages may be applied to phage typing.
Assists researchers studying phages related to T7M, T3, ϕYeO3-12, and ϕSG-JL2 in making sequence comparisons.

Data
Fig. 1 plots the distribution of SSR period sizes and motifs in the non-recombinant regions of the genomes of phages T7M, T3, ϕYeO3-12, and ϕSG-JL2. Table 1 illustrates differences in genomic sequence between T7M and T3. Tables 2 and 3 provide SSR counts, relative abundance, relative density, and GC contents in the complete genomes and non-recombinant regions of T7M, T3, ϕYeO3-12, and ϕSG-JL2. Tables 4-9 tabulate the four classes of SSR variations in the T7M non-recombinant regions relative to the counterpart regions of ϕYeO3-12 and ϕSG-JL2: (1) insertion/deletion of SSR units, (2) expansion/contraction of SSRs without alteration of genome length, (3) changes of repeat motifs, and (4) generation/loss of repeats.
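As an illustration of the summary quantities named above, the following minimal Python sketch computes them for a toy input. The definitions used here are assumptions, since this article does not spell them out: relative abundance is taken as the number of SSRs per kb of sequence analysed, relative density as the total SSR length in bp per kb, and GC content as the percentage of G and C bases. The function names and the `(start, end)` tuple format are hypothetical, and the numbers are toy values, not data from the tables.

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases A, C, G, T."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def ssr_summary(ssrs, region_length_bp):
    """Summarise SSRs given as (start, end) pairs, 1-based and inclusive.

    ASSUMED definitions (not stated in the article):
      relative abundance = number of SSRs per kb
      relative density   = total SSR length (bp) per kb
    """
    kb = region_length_bp / 1000.0
    total_bp = sum(end - start + 1 for start, end in ssrs)
    return {
        "count": len(ssrs),
        "relative_abundance": len(ssrs) / kb,
        "relative_density": total_bp / kb,
    }


# Toy input for illustration only; these are not values from the article.
toy_ssrs = [(3, 7), (9, 14)]
toy_summary = ssr_summary(toy_ssrs, region_length_bp=2000)
toy_gc = gc_content("ATGC")
```

With the toy input, two SSRs covering 11 bp in a 2000 bp region give 1.0 SSR per kb and 5.5 bp per kb under these assumed definitions.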

Genome sequences and recombinant regions
The genome sequence of T7M is available from NCBI under GenBank accession number JX421753 [1]. Genome sequences of ϕYeO3-12, ϕSG-JL2, and T3 were obtained from GenBank accession numbers AJ251805 [2], NC_010807 [3], and AJ318471 [4], respectively. Sequences were aligned with ClustalW [5], and differences between the phages were compared. T7M nt 13245-16687 and 26695-35789 align to T3 nt 13243-16685 and 26700-35794, respectively, and likely arose from recombination between a ϕYeO3-12-like phage and a T7-like phage, as suggested for T3 [4]. These regions and their counterparts in ϕYeO3-12 and ϕSG-JL2 are referred to as recombinant regions, and the remainder of each genome is referred to as the non-recombinant regions [1].
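The split into recombinant and non-recombinant regions can be sketched in a few lines of Python. This is only an illustration of the coordinate arithmetic, not the procedure used in the article. The T7M recombinant coordinates are taken from the text above; the function names and the genome length `EXAMPLE_LENGTH` are hypothetical placeholders, and the actual length of the GenBank record should be used in practice.

```python
def complement_intervals(length, regions):
    """Return the 1-based inclusive intervals of [1, length] not covered by regions."""
    gaps = []
    start = 1
    for lo, hi in sorted(regions):
        if lo > start:
            gaps.append((start, lo - 1))
        start = max(start, hi + 1)
    if start <= length:
        gaps.append((start, length))
    return gaps


def extract(seq, intervals):
    """Slice a sequence string by 1-based inclusive intervals."""
    return [seq[lo - 1:hi] for lo, hi in intervals]


# T7M recombinant regions as given in the text (nt, 1-based, inclusive).
T7M_RECOMBINANT = [(13245, 16687), (26695, 35789)]

# HYPOTHETICAL genome length for illustration; replace with the real record length.
EXAMPLE_LENGTH = 38000

non_recombinant = complement_intervals(EXAMPLE_LENGTH, T7M_RECOMBINANT)
```

Under the placeholder length, this yields three non-recombinant intervals: nt 1-13244, nt 16688-26694, and nt 35790 to the end of the genome.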

Simple sequence repeats
Simple sequence repeats were searched for in the complete phage genomes or in the non-recombinant regions using IMEx [6]. Unless otherwise specified, the minimum numbers of repeat units for mono- to hexanucleotide repeats were 5, 3, 3, 2, 2, and 2, respectively. Repeat sequences were not standardized.
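To make the repeat-unit thresholds concrete, the sketch below scans a sequence for perfect SSRs with the minimum unit counts given above. It is not IMEx and does not reproduce its algorithm or options. It is a simplified stand-in that reports perfect tandem repeats only, skips motifs that are themselves repeats of a shorter unit, and reports motifs as they occur in the sequence (no standardization). The function names and the toy sequence are hypothetical.

```python
# Minimum repeat units for mono- to hexanucleotide repeats, as given in the text.
MIN_UNITS = {1: 5, 2: 3, 3: 3, 4: 2, 5: 2, 6: 2}


def is_primitive(motif):
    """True if the motif is not itself a tandem repeat of a shorter unit."""
    k = len(motif)
    return not any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k))


def find_ssrs(seq, min_units=MIN_UNITS):
    """Return perfect SSRs as (start, end, motif, units), 1-based inclusive."""
    seq = seq.upper()
    hits = []
    for k, min_n in min_units.items():
        i = 0
        while i + k <= len(seq):
            motif = seq[i:i + k]
            # Skip ambiguous bases and motifs such as "AA" or "CACA".
            if not set(motif) <= set("ACGT") or not is_primitive(motif):
                i += 1
                continue
            n = 1
            while seq[i + n * k:i + (n + 1) * k] == motif:
                n += 1
            if n >= min_n:
                hits.append((i + 1, i + n * k, motif, n))
                i += n * k  # continue after the repeat tract
            else:
                i += 1
    return sorted(hits)


# Toy sequence for illustration only; not taken from any phage genome.
toy_hits = find_ssrs("TTAAAAAGCACACACTG")
```

For the toy sequence this reports an `A` tract of five units at positions 3-7 and a `CA` repeat of three units at positions 9-14. Because each period size is scanned independently, tracts of different period sizes may overlap in the output, which a dedicated tool may handle differently.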