Revised whole genome and DNA methylome of Mycobacterium marinum type strain ATCC 927T

ABSTRACT Mycobacterium marinum, a slow-growing Actinobacterium, typically induces tuberculosis-like disease in fish. Here, we report a new reference sequence for M. marinum ATCC 927T, along with its DNA methylome. This resource aims to maximize the research potential of this type strain and to facilitate investigations into the pathomechanisms of human tuberculosis.

FIG 1 (A) Whole-genome alignment of strain ATCC 927T to its previously sequenced passage variant CCUG 20998. ProgressiveMauve (27) was used to align the indicated genomes in a step-by-step manner for the identification of conserved regions and rearrangements between the two genomes. Significant differences are indicated with arrows and numbered as follows. 1, a truncated ABC transporter permease (QDR78_11875 versus intact CCUG20998_RS12005) due to a sequence deletion. 2, a truncated PE family protein due to a frameshift (QDR78_13350 versus CCUG20998_RS28445); pseudogenes on both genomes. 3, an insertion of six genes (CCUG20998_RS15230-CCUG20998_RS15255) coding for an IS3 family transposase, recombinase (a pseudogene), cytochrome P450, TetR/AcrR family transcriptional regulator, a frameshifted TetR/AcrR family transcriptional regulator, and a frameshifted IS256 family transposase. These genes correspond to plasmid-located genes in ATCC 927T (QDR78_27190-QDR78_27215) coding for TetR/AcrR family transcriptional regulator, TetR/AcrR family transcriptional regulator, cytochrome P450, DUF308 domain-containing protein (a membrane-associated protein), a pseudogene recombinase family protein, and an IS3 family transposase, respectively. 4, a truncated non-ribosomal peptide synthase/polyketide synthase, which is also split into two parts: CCUG20998_RS28465 (AMP-binding protein, 322 amino acids) and CCUG20998_RS28470 (amino acid adenylation domain-containing protein, 1,468 amino acids), versus QDR78_13590 (9,908 amino acids). 5, another non-ribosomal peptide synthase/polyketide synthase: CCUG20998_RS15760 (12,024 amino acids) versus QDR78_15505 (9,880 amino acids).
(B) Active and predicted R-M systems found in strain ATCC 927T, along with matched methylation motifs detected through SMRT whole-methylome analysis. The motifs GTAYNNNNATC and CTGGAG and their corresponding R-M systems are also found in the Mycobacterium marinum strain MMA1 (REBASE org. #46029). Other PacBio-sequenced strains contain either the first (strain 050012) or the second (Mycobacterium marinum strains E11 and H01) of the R-M system-motif pairs, whereas the R-M system with recognition motif CGACNNNNNNCTGG is unique; modified adenines (indicated as T for the complementary strand) are marked in blue. Strain MMA1, a human-infection-associated isolate (26), also possesses an additional Type I system with recognition motif CCACNNNNNNNTCCC that is not found in other strains. No activity was detected for the prophage-encoded methyltransferases. #, numbers of detected m6A type modifications (%, number of detected m6A modifications per total number of all detected modifications).
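The recognition motifs in panel B use IUPAC degenerate nucleotide codes (Y = C or T, N = any base). As an illustration only, the following minimal Python sketch shows how such a motif could be located on both strands of a sequence. It is not the SMRT methylome pipeline used in this study; the function names (reverse_complement, motif_to_regex, find_motif_sites) are hypothetical, and reporting reverse-strand hits by their forward-strand start coordinate is an assumption of this sketch.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Complement table that also covers the degenerate codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(motif):
    """Reverse complement of a motif written in IUPAC codes."""
    return motif.upper().translate(COMPLEMENT)[::-1]


def motif_to_regex(motif):
    """Compile a motif into a regex; the lookahead reports overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_motif_sites(sequence, motif):
    """Return sorted (0-based forward-strand start, strand) pairs for a motif.

    Hypothetical helper: a '-' hit means the motif reads on the reverse
    strand, found by searching the forward sequence for its reverse complement.
    """
    seq = sequence.upper()
    motif = motif.upper()
    sites = [(m.start(), "+") for m in motif_to_regex(motif).finditer(seq)]
    rc = reverse_complement(motif)
    if rc != motif:  # palindromic motifs would otherwise be counted twice
        sites += [(m.start(), "-") for m in motif_to_regex(rc).finditer(seq)]
    return sorted(sites)
```

For example, find_motif_sites(genome_sequence, "GTAYNNNNATC") would list candidate sites for the first motif above, where genome_sequence is a placeholder for a sequence string loaded by the reader. Whether a given site is actually methylated can only be determined from the sequencing kinetics, not from the sequence alone.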

TABLE 1 Comparison of the whole-genome details of the Mmr ATCC 927T sequenced here with the same strain sequenced earlier by Nanopore/Illumina (NI) (25) and its passage variant CCUG 20998 using PacBio SMRT (PB) (26) e

[Table body not recovered. Surviving column labels: Name of the strain; Accession no.; Size (bp); G + C (%); No. genes; No. pgenes a; No. tRNAs. Surviving column groups: Genome; Plasmid; Origin. Only surviving entry: ATCC 927T (NI), accession no. NZ_AP018496.1.]

a pgenes, pseudogenes.
b NR, not reported.
c Due to the lack of plasmid sequence at NCBI, the annotation was not updated from the original.
d The difference in length is a result of the accumulation of sequencing errors, particularly in regions of homopolymeric stretches, as determined by BLASTN 2.15.0+ nucleotide alignment.
e For an accurate comparison, all previously generated whole-genome sequences were obtained from the RefSeq database, where they were subjected to PGAP re-annotation.

Table 1 summarizes the central genomic data of ATCC 927T in comparison to other related Mmr genomes. Figure 1 illustrates the major differences between ATCC 927T and its passage variant CCUG 20998, also sequenced by PacBio SMRT (25), and shows the methylation motifs with their corresponding active and predicted R-M systems. In summary, this study provides a new reference sequence for ATCC 927T and reports the first in-depth DNA methylome analysis for this non-tuberculous bacterial model.