Genomic Insights into Methicillin-Resistant Staphylococcus aureus spa Type t899 Isolates Belonging to Different Sequence Types

This study showed the genetic diversity and population structure of S. aureus isolates presenting the same spa type, t899, but belonging to different STs. Our findings revealed that these isolates differ substantially in their core and accessory genomes, contrary to what is regularly inferred from studies using spa typing only.

and smaller CC9 region with a spa gene (9). CC9 differs from the European pig-associated CC398 with regard to clonal type, staphylococcal cassette chromosome mec element (SCCmec) content, and resistance profile (4).
The CC9/CC398 hybrid strain has been identified from livestock (9, 10) and related personnel (11, 20) in several European countries. Furthermore, two CC398 LA-MRSA spa type t899 isolates were recently reported as incidental findings during a clinical investigation from turkey and pheasant in the United Kingdom (10, 21). These two isolates were shown to belong to the CC9/CC398 hybrid genotype and were quite similar to the clone that was reported from continental Europe (10, 13, 22). Interestingly, t899 has been increasingly associated with several single-locus variants (SLVs) of CC398 and CC9 (16-18).
Overall, the occurrence of the same spa types in distant lineages has been reported, resulting from either convergent evolution or genetic recombination (23,24). As an example, CC239 MRSA is a hybrid strain of CC30 (founder, ST30) and CC8 (founder, ST8) (23). ST34 and ST42 backgrounds have also been suggested to be of hybrid origin (24). Recently, t304 isolates belonging to ST6, ST1649, ST8, and ST4290 were reported by Bartels et al. (25).
Since spa typing is still widely used as the sole typing method, for example in large surveillance studies and in low-income countries, our aim was to characterize t899 isolates using single nucleotide polymorphism (SNP)-based phylogeny on publicly available genomic data and associated metadata, and thereby to identify markers that could be implemented easily and inexpensively to assign LA-MRSA lineages with higher accuracy.

RESULTS
Thirty-four LA-MRSA genomes of t899 isolates were analyzed, of which 20 belonged to ST398, 13 to ST9, and 1 to ST4034 (a single-locus variant [SLV] of ST398, differing by one substitution [A294T] in the arcC gene). Metadata associated with the selected isolates were recorded (Table 1). All t899 isolates harbored the mecA gene on a SCCmec IVa(2B) element, except for two which presented either the SCCmec V element or an undefined cassette. Both the spanning tree and the SNP-based phylogenetic tree (Fig. 1B and 2) confirmed a strong clustering according to the ST, with the ST4034/t899 isolate differing from ST398/t899 by only 41 core alleles. Other characteristics, including matrix/sample origin, did not appear to cluster in this SNP analysis.
The phylogenetic tree (Fig. 2) also showed major divergences in the antimicrobial resistance and virulence patterns depending on the ST. All ST398 isolates carried the tet(M) gene, while none of the ST9 isolates carried this tetracycline resistance gene (Fig. 2). Concerning virulence markers, most ST398 spa type t899 isolates harbored the scn and sak genes, indicating the presence of the IEC cluster, while ST9 isolates were devoid of the IEC cluster but systematically harbored the seg, sei, sem, sen, and seu genes, encoding enterotoxin-like proteins (Fig. 2).

DISCUSSION
In this study, the SNP-based phylogeny analysis was consistent with the core genome multilocus sequence typing (cgMLST) analysis, with t899 isolates clustering apart based on STs. Although spa typing has a remarkable predictive power over clonal relationships, predicting genetic relatedness based on spa type does not appear appropriate for isolates that have undergone major recombination events, including spa gene passages (26-31). Such large-scale genomic recombination is infrequent in S. aureus, and the major representatives of such events are ST239, ST34, and ST42. The CC9/CC398 hybrid is another important example, giving rise to t899 isolates which largely diverge from their original CC9 genetic backgrounds and which can cause human disease given their arsenal of virulence factors.
Here, we characterized t899 isolates from different STs using whole-genome sequencing (WGS)-based approaches, together with epidemiological data, antimicrobial resistance genes, and virulence markers. This analysis revealed the differential occurrence of genes that can be used to further characterize t899 isolates. ST398-t899 isolates harbored the IEC cluster, which is crucial for disrupting the normal function of the human immune system (22, 32-34). Among the 34 t899 isolates tested, all ST398 representatives harbored the tet(M) gene, which is either transposon located or chromosomal, while ST9 representatives either were susceptible or carried the plasmid-located tet(L) gene. The tetracycline resistance gene tet(M) is a common feature of LA-MRSA ST398, while it is absent from MRSA ST9 (35-37). In contrast, ST9 isolates carried staphylococcal enterotoxin (SE) genes, which were not detected in ST398 isolates. This clear discrepancy between the two lineages would be useful to refine the LA-MRSA characterization when only spa typing is used and it indicates the presence of t899 isolates.
Overall, investigations into S. aureus populations using WGS would be useful for future molecular epidemiology studies and for more closely examining the global evolution of S. aureus lineages. WGS also helps to assess the performance of classical typing methods by comparison. According to David et al. (38), two genotyping methods examining distinct genetic loci will not consistently provide identical results in classifying MRSA isolates, mostly because these methods assess genetic differences that can evolve independently. Classification systems often employed for epidemiological research have created competing nomenclatures that are useful for assessing the relatedness of isolates but are unfortunately not always directly comparable. This study emphasizes that spa typing is not sufficient to characterize t899-positive LA-MRSA. Accordingly, this study suggests the usefulness of an additional genomic marker to assign t899-positive MRSA isolates to the ST9 or ST398 clone, which may include the tet(M), sak, and/or seg genes. Of course, this analysis should be refined when new t899 isolates belonging to other STs are sequenced and characterized.
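The marker-based assignment suggested above can be illustrated with a minimal sketch. It assumes that gene presence has already been determined (for example by PCR or from WGS data) and is supplied as a set of gene names. The function name `assign_lineage`, the labels it returns, and the decision rule itself are hypothetical illustrations built from the pattern reported here (tet(M) and sak in ST398, seg in ST9); they are not a validated typing scheme.

```python
from typing import Set

# Markers reported in this study as differing between the two lineages.
# Treating them as a decision rule is an assumption made for illustration only.
ST398_MARKERS = {"tet(M)", "sak"}
ST9_MARKERS = {"seg"}


def assign_lineage(genes: Set[str]) -> str:
    """Suggest a lineage for a t899 isolate from its detected marker genes.

    Returns "ST398-like", "ST9-like", or "unresolved" when the markers are
    absent or point in both directions.
    """
    st398_evidence = genes & ST398_MARKERS
    st9_evidence = genes & ST9_MARKERS
    if st398_evidence and not st9_evidence:
        return "ST398-like"
    if st9_evidence and not st398_evidence:
        return "ST9-like"
    return "unresolved"


if __name__ == "__main__":
    print(assign_lineage({"tet(M)", "sak", "scn"}))
    print(assign_lineage({"seg", "sei", "tet(L)"}))
```

Isolates with conflicting or missing markers are deliberately left unresolved, since t899 isolates from STs other than ST9 and ST398 may not follow this pattern.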

MATERIALS AND METHODS
Bacterial collection. Thirty-four t899 S. aureus isolates were identified in the publicly available databases, and their corresponding characteristics (MLST, matrix [human, food, and animal origins], and geographical origin) were recorded. Raw reads were downloaded from NCBI, reads were quality checked with FastQC v.0.65, and low-quality reads were trimmed using Trimmomatic v.0.36.4 (39). Subsequently, contigs were generated using the SPAdes v.3.5.0 algorithm (40), and those whose length exceeded 200 bp were retained in the assembly. In the literature, spa type t899 was also found to belong to 15 other SLVs and multilocus variants (MLVs) of ST9 and ST398 (see Table S1 in the supplemental material). These isolates could unfortunately not be included in our analysis because of the absence of associated WGS data.
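The contig length filter described above can be sketched in a few lines of plain Python. This is an illustrative stand-in rather than the workflow actually used in the study; the function names `read_fasta` and `filter_contigs` are hypothetical, and the only detail taken from the text is the threshold (contigs longer than 200 bp are kept).

```python
import io
from typing import Dict, Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def filter_contigs(handle: TextIO, min_length: int = 200) -> Dict[str, str]:
    """Keep only contigs whose length strictly exceeds min_length (in bp)."""
    return {name: seq for name, seq in read_fasta(handle) if len(seq) > min_length}


if __name__ == "__main__":
    # Toy assembly: one 250-bp contig and one 50-bp contig.
    toy = io.StringIO(">contig_1\n" + "A" * 250 + "\n>contig_2\n" + "G" * 50 + "\n")
    print(list(filter_contigs(toy)))
```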
cgMLST analyses. Isolates were subjected to cgMLST analyses. Genome-wide gene-by-gene microbial typing was performed using Ridom SeqSphere+ S. aureus cgMLST analysis with default parameters (41). The cgMLST data contain 1,861 coding loci representing the core genome (41). Once an allelic profile was assigned to each genome, a minimum spanning tree was constructed from the concatenated core genome sequences and visualized using the online tool PHYLOViZ. cgMLST loci with no allele calls were ignored in the pairwise comparison during the tree construction. The minimum spanning tree constructed on the basis of cgMLST data illustrates clusters by ST, spa type, or matrix (Fig. 1).
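The pairwise comparison described above, in which loci with no allele call are ignored, can be illustrated as follows. This is a minimal sketch and not the SeqSphere+ or PHYLOViZ implementation. It assumes that each allelic profile is a list of allele numbers in a fixed locus order, with `None` marking a missing call; the names `allelic_distance` and `pairwise_distances` are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Optional, Tuple

# One allele number per cgMLST locus; None marks a locus with no allele call.
Profile = List[Optional[int]]


def allelic_distance(a: Profile, b: Profile) -> int:
    """Count loci with different alleles, skipping loci missing in either profile."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same loci in the same order")
    return sum(
        1 for x, y in zip(a, b) if x is not None and y is not None and x != y
    )


def pairwise_distances(profiles: Dict[str, Profile]) -> Dict[Tuple[str, str], int]:
    """Return the allelic distance for every unordered pair of isolates."""
    return {
        (i, j): allelic_distance(profiles[i], profiles[j])
        for i, j in combinations(sorted(profiles), 2)
    }


if __name__ == "__main__":
    # Toy three-locus profiles; real profiles would span 1,861 loci.
    toy = {"iso_a": [1, 2, 3], "iso_b": [1, 2, 4], "iso_c": [None, 5, 4]}
    print(pairwise_distances(toy))
```

A distance matrix of this kind is the usual input for minimum spanning tree construction.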
SNP analysis and phylogenetic tree. A phylogenetic tree was constructed based on single nucleotide polymorphism (SNP) analysis (9, 10, 22). SNPs were identified by mapping reads against the ST398 reference genome (strain S0385; GenBank accession no. AM990992). The maximum-likelihood phylogenetic tree was established in CSI Phylogeny using default settings (42). The phylogenetic tree was visualized using iTOL (Interactive Tree of Life) (43).
Detection of resistance genes and selected virulence markers using WGS data. The online tools ResFinder v.3.2 (44) and VirulenceFinder v.2.0 (45) from the Center for Genomic Epidemiology web-based platform were used to detect genes encoding potential resistance to antimicrobials and virulence markers, respectively. For a hit to be reported by either program, it had to cover at least 60% of the length of the gene sequence in the database, with minimum sequence identities of 60% (ResFinder) and 90% (VirulenceFinder). WGS-assembled data were used to perform the analysis.
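The reporting thresholds described above amount to a simple filter on coverage and identity. The sketch below applies them to hits that have already been parsed into records. The `Hit` class, its field names, and the `passes` function are hypothetical and do not reflect the actual ResFinder or VirulenceFinder output format.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    gene: str
    coverage: float  # percentage of the database gene length covered by the hit
    identity: float  # percentage sequence identity to the database gene


# Thresholds as stated in the text: 60% coverage for both tools, with identity
# cutoffs of 60% for resistance genes and 90% for virulence markers.
MIN_COVERAGE = 60.0
MIN_IDENTITY = {"resistance": 60.0, "virulence": 90.0}


def passes(hit: Hit, database: str) -> bool:
    """Return True if the hit meets the coverage and identity cutoffs."""
    return hit.coverage >= MIN_COVERAGE and hit.identity >= MIN_IDENTITY[database]


if __name__ == "__main__":
    hits = [Hit("tet(M)", 100.0, 99.8), Hit("seg", 80.0, 85.0)]
    print([h.gene for h in hits if passes(h, "virulence")])
```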
Data availability. The sequence information for isolates SAV1035, SAV1149, SAV1150, SAV1158, and SAV1228 has been deposited in the SRA database under study accession number SRP161670. Individual accession numbers are listed in Table 1.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only. SUPPLEMENTAL FILE 1, PDF file, 0.1 MB.

ACKNOWLEDGMENTS
We thank the service Transversal Activities in Applied Genomics from Sciensano for the paired-end sequencing reactions and for the development and maintenance of the in-house instance of the Galaxy workflow management system and Andrew D. Miller for his help in proofreading the manuscript. We thank Mirko Rossi for his valuable input in the course of designing this study and Beatriz Guerra and Antonio Rinaldi for cgMLST analysis.
This study was supported by the Ministry of Education, Youth and Sports, project no. CZ.1.05./2.1.00/19.0385. Funding sources did not affect the design of this study, data collection, data analysis, decisions on publication, or preparation of the manuscript.