Genome Sequences of Newly Isolated Mycobacteriophages Forming Cluster S

We describe the genomes of two mycobacteriophages, MosMoris and Gattaca, newly isolated on Mycobacterium smegmatis. The two phages are very similar to each other, differing in 61 single nucleotide polymorphisms and six small insertion/deletions. Both have extensive nucleotide sequence similarity to mycobacteriophage Marvin and together form cluster S.

cobacterial host. Mycobacterium smegmatis is currently a host to >6,900 phages (1). Phage populations are numerous and diverse: it is estimated that 10^31 different particles exist (2,3). Based on shared gene content, these phages are divided into 26 clusters, with singletons that do not belong to any cluster. The genomes in this report establish cluster S, one of the smaller clusters, making up 0.2% of currently known phages and containing no subclusters (1,2). Marvin was the first cluster S phage to be discovered and remained a singleton until the discovery of two similar phages, MosMoris and Gattaca (see GenBank nucleotide sequence accession numbers in Table 1).
One of the phages, Marvin, was isolated in Radnor, PA; the other two, MosMoris and Gattaca, were isolated in Milbridge, ME (1), from soil samples in different years by direct plating. Once isolated, the phages formed plaques on a lawn of M. smegmatis mc²155. Plaques were picked, purified, and amplified, and the DNA was extracted. Marvin, MosMoris, and Gattaca were sequenced at the Joint Genome Institute (Sanger sequencing) and the Pittsburgh Bacteriophage Institute (Ion Torrent and Illumina sequencing), respectively.
After sequencing, reads were assembled using Newbler and Consed. The DNA Master software (http://cobamide2.bio.pitt.edu) was used to annotate the genomes with an integrated approach using Glimmer, GeneMark, BLAST, and Shine-Dalgarno positioned weight scores. Phamerator (4) and HHpred (5) were used to predict the function of hypothetical proteins.
These cluster S mycobacteriophages display a morphotype of Siphoviridae and have an average G+C content of 64.3% (1). The average length of members of cluster S was 65,193 bp (1), and they are 99% identical to each other (2). Each phage in the cluster has a 3′ sticky overhang of 11 bp that contains a sequence of GCGCGCAGCGC (1).
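The G+C content and overhang length quoted above are simple to check on any sequence. The following is a minimal Python sketch, not the method used in this report: the function name `gc_content` is hypothetical, and the only sequence used is the 11-bp overhang given in the text. Applying it to a whole genome would assume the sequence has first been downloaded from GenBank.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# 3' overhang sequence reported for the cluster S phages.
OVERHANG = "GCGCGCAGCGC"

overhang_length = len(OVERHANG)
overhang_gc = gc_content(OVERHANG)

print(f"Overhang length: {overhang_length} bp")
print(f"Overhang G+C content: {overhang_gc:.1f}%")
```

The same function could be called on a full genome string to compare against the reported cluster average.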
The three genomes have an average of 111 open reading frames (ORFs) (Table 1). All of the genes are forward-transcribed except for 10 to 11 ORFs near gp100 and two single leftwards-transcribed genes in the left parts of the genomes: a DNA methylase and a gene of unknown function. Many of the genes, particularly in the rightmost parts of the genomes, are of unknown function. However, the following were identified by homology: terminase, tape measure protein, minor tail protein, lysin A, lysin B, holin, WhiB, exonuclease, hydrolase, methyltransferase, galactosyl transferase, and a glycosyl transferase. The genes are syntenic and tightly packed, except for the region where the direction of transcription changes. MosMoris and Gattaca both contain an insertion of an HNH endonuclease in the minor tail genes that is absent in Marvin. In addition, Marvin has a putative insertion with low coding potential near 9,500 bp that is not seen in the other two cluster members.
By nucleotide similarity, the closest relatives of cluster S are two singletons, Wildcat and Sparky (2), both of which have low similarity to other phages. The closest clusters to cluster S are clusters M and T.
Accession number(s). The whole-genome sequences have been deposited at GenBank under the accession numbers listed in Table 1.