Evolution and Population Dynamics of Clonal Complex 152 Community-Associated Methicillin-Resistant Staphylococcus aureus

Understanding the evolution of CA-MRSA is important given the growing role of this reservoir in the dissemination of MRSA. Here, we trace the history of the CA-MRSA CC152 lineage using whole-genome sequencing on an international collection of CC152. We show that the evolution of this lineage is novel and that antibiotic usage may have the potential to select for the phage-encoded Panton-Valentine leukocidin. The diversity of the strains correlated strongly with geography, with a higher level of resistance observed among the European MRSA isolates. The mobility of the SCCmec element is required for the emergence of novel MRSA lineages, and we show here distinct acquisitions, one of which is linked to the successful clone found throughout Europe today.

KEYWORDS CA-MRSA, CC152, MRSA, PVL, SCCmec, antibiotic resistance, evolution, genetics, virulence, S. aureus

In the late 1990s, a remarkable change in the epidemiology of Staphylococcus aureus was observed with the emergence of methicillin-resistant S. aureus (MRSA) in the community (1,2). Today, multiple lineages of community-associated MRSA (CA-MRSA) are observed worldwide (3), and the evolutionary history of some of these has been described in detail using whole-genome sequencing on global collections of isolates (4-7). As common characteristics, CA-MRSA clones generally are less resistant to antibiotics than health care-associated MRSA (HA-MRSA), harbor smaller staphylococcal cassette chromosome mec (SCCmec) type IV or V elements, and often express the Panton-Valentine leukocidin (PVL) toxin encoded by the prophage genes lukS/F-PV. However, the clinical epidemiological distinction between HA- and CA-MRSA has started to dissipate as CA-MRSA lineages, in particular USA300, have expanded their niches into hospitals (8,9). CA-MRSA clones have been observed to be geographically restricted (10), with the European CA-MRSA as one example. This lineage, belonging to clonal complex 80 (CC80), was derived from a PVL-positive methicillin-susceptible S. aureus (MSSA) ancestor from sub-Saharan Africa and became a dominant CA-MRSA clone in Europe, Northern Africa, and the Middle East in the 1990s and 2000s (5). Other well-characterized CA-MRSA lineages that also have shown intercontinental transmission include USA300, where genome analyses point to a European ancestor prior to acquisition of PVL and SCCmec type IV in the United States (6), and the multidrug-resistant and PVL-positive Bengal Bay clone (MLST sequence type 772 [ST772]) that has spread globally with an origin in community- and health care-associated environments in India (4).
The different CA-MRSA lineages observed worldwide have shown different geographical origins and dissemination patterns, with key genetic acquisitions strongly associated with their expansion. The S. aureus lineage CC152 has been linked to community cases throughout Europe, with particularly strong epidemiological links to the Balkan region, and encodes key CA-MRSA genetic features such as PVL and carriage of a small SCCmec type V element (3,11). CC152 PVL-positive MSSA isolates have been reported in the Canary Islands and in sub-Saharan Africa and were found as a highly dominant nasal carrier clone in Mali (11). In the present study, an international collection of CC152 MSSA and MRSA isolates was investigated to understand the origin, evolution, and dissemination of the lineage.

RESULTS
Strain collection and genome sequencing. This study included 149 CC152 S. aureus isolates (93 MSSA and 56 MRSA) sampled between 1999 and 2015 in 28 different countries in Europe, Africa, Australia, and the Caribbean Islands (Fig. 1; see also Table S1 in the supplemental material). Of these, 31 were selected and obtained for genome sequencing based on the MRSA TypeCat database to expand the temporal and geographical diversity of the collection, and 10 genomes were obtained from NCBI's RefSeq database.
The majority of the CC152 isolates belonged to ST152 (n = 139; 93%); of the remaining 10 isolates (seven MSSA and three MRSA), 9 were single-locus variants of ST152 (four ST1518, one ST1831, and four different new STs), whereas another single isolate was a double-locus variant of ST152 (new ST; see Table S1). Twenty-two different spa types were identified, with t355 being the most common (n = 98; 65.8%; see Table S1). The isolates were sequenced to an average depth of ~120-fold.
Phylogenetic relatedness and geography. A total of 5,189 single nucleotide polymorphisms (SNPs) were detected within 82% (~2.19 Mb) of the detected CC152 core genome among all 149 isolates. Five different strains were each used for rooting the tree, and all revealed identical rooting and overall topology (data not shown). The rooted phylogeny was based on 5,149 SNPs after purging of eight putative recombinant regions spanning between three and 15 nucleotides (see Fig. S1 in the supplemental material). An unrooted CC152 phylogeny (Fig. 2A) showed mainly two distinct clades: one contained primarily MSSA isolates (MSSA-dominated clade, Fig. 2A), and the other contained only MRSA isolates (MRSA clade, Fig. 2A). By rooting the phylogeny (Fig. 2B) and populating it with geographical data, the MRSA clade was linked almost exclusively to Europe, while the other more diverse clade contained mainly isolates from sub-Saharan Africa and the Caribbean but also some with a European origin. All major clades had high posterior bootstrap values (Fig. S3).
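The figures quoted in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that the 82% refers to the 2.67-Mbp reference chromosome named in Materials and Methods, and the constant and function names are hypothetical rather than part of the authors' pipeline.

```python
# Figures quoted in the text. Pairing the 82% with the 2.67-Mbp reference
# chromosome is an assumption made for this check.
REFERENCE_LENGTH_MBP = 2.67
CORE_FRACTION = 0.82
SNPS_BEFORE_PURGE = 5189
SNPS_AFTER_PURGE = 5149


def core_genome_summary():
    """Return (core genome size in Mb, SNPs purged, SNPs per kb after purging)."""
    core_mb = REFERENCE_LENGTH_MBP * CORE_FRACTION
    removed = SNPS_BEFORE_PURGE - SNPS_AFTER_PURGE
    snps_per_kb = SNPS_AFTER_PURGE / (core_mb * 1000)
    return core_mb, removed, snps_per_kb


if __name__ == "__main__":
    core_mb, removed, snps_per_kb = core_genome_summary()
    print(f"core genome: {core_mb:.2f} Mb")
    print(f"SNPs removed with putative recombinant regions: {removed}")
    print(f"SNP density after purging: {snps_per_kb:.2f} per kb")
```

Under that assumption, the product rounds to the 2.19 Mb given in the text, and the purge of the eight putative recombinant regions accounts for 40 SNPs.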
Prevalence of the Panton-Valentine leukocidin-encoding Sa2 prophage. Among the CC152 isolates, the majority (n = 139, 93%) were PVL positive and carried lukF/S-PV genes in the Sa2 prophage. Sequence analysis of the diversity of the Sa2 PVL-prophage (Fig. S2) among CC152 and other Sa2-positive CA-MRSA lineages clearly supports a single ancestral acquisition of the Sa2 prophage in CC152. The PVL-negative isolates (n = 10) were generally scattered across the tree, indicating independent loss of the prophage (Fig. 2).
Coalescent analysis of CC152. The best model for the Bayesian analysis was the general time reversible (GTR) model under the random local clock and the Bayesian Skyline prior with a >10 Bayes factor. The analysis revealed that the time to most recent common ancestor (TMRCA) of the CC152 lineage dated to ~1970 (95% highest posterior densities [HPDs], 1964 to 1974), with the emergence of the MRSA clade ~25 years later (95% HPDs, 1993 to 1995) based on the included strain collection. The time tree (Fig. 2B) shared a highly similar topology with the rooted maximum-likelihood phylogeny (Fig. S1). The Skyline plot (Fig. S4) shows a large increase in effective population size coinciding with the acquisition of the SCCmec element in the monophyletic MRSA clade. The time-based tree (Fig. 2B) further supports the hypothesis that the ancestral population of CC152 was PVL positive. All MRSA isolates were from Europe except for one isolate from Australia, regardless of the position in the phylogeny (Fig. 2A). The majority of the MSSA isolates formed a large clade of isolates from Europe, sub-Saharan Africa, and the Caribbean. Six MSSA isolates (intermittent group, Fig. 2A) clustered between the MSSA-dominated clade and the MRSA clade, with four from Europe and two from North Africa. Combined, the analyses point to an origin of the CC152 lineage in North Africa or in Europe.
Resistance determinants and genetic context. Among the 149 isolates, 14 different resistance genes were detected (Table 1 and Fig. S1). The predominant resistance determinant among the 149 CC152 isolates was blaZ (n = 137, 92%), whereas the aminoglycoside resistance gene aac(6′)-aph(2″) was almost exclusively found in the MRSA clade (n = 51, 96%) but rarely found outside this clade (n = 2, 2%). Analysis of the predominant SCCmec type V (5C2&5) showed the aac(6′)-aph(2″) gene to be part of the J2 region of the element in 93% of these isolates, including all isolates in the MRSA clade. The aminoglycoside resistance gene aadD was only found in 19% of isolates (n = 10/96) in the MRSA clade. However, dfrG conferring resistance to trimethoprim was found in 19% (n = 18/96) of the MSSA isolates, which resided in two distinct monophyletic groups: one included 10 isolates where dfrG resided in the PVL-encoding prophage Sa2, whereas in the other monophyletic group of 8 isolates no prophage or other identifiable genetic element was identified in the vicinity of the resistance gene (data not shown). No isolates carried known fusA mutations, and only a single MRSA isolate carried the fusB gene linked to fusidic acid resistance. Carriage of the tetracycline resistance determinant tet(K) was found throughout the phylogeny, with 19% (n = 10/49) of the isolates positive in the MRSA clade and 29% (n = 28/96) among the remaining isolates, all located on a pT181 plasmid.

DISCUSSION
The S. aureus CC152 lineage has been reported as a PVL-positive CA-MRSA that has only rarely been reported outside the European continent. Conversely, PVL-positive CC152 MSSA isolates have been associated with the African continent and the Caribbean and less often in Europe (8,11). Thus, the evolution of CC152 resembles in many ways the genotypic and spatial characteristics of the European CC80 CA-MRSA clone (5). Here, we analyzed a collection covering 28 countries and spanning from the earliest reported isolates in 1999 until 2015 to understand the evolution of the CC152 lineage.
The data in this study, combining molecular analyses with temporal and spatial information, indicate that the CA-MRSA CC152 clone originated north of the Sahara from a PVL-positive MSSA ancestor. In both phylogenetic approaches applied to the available data, deep branches near the root all point to European or Northern African MSSA isolates. Specifically, a small cluster of three isolates, including two North African (Algerian) and a single European (French) isolate, clustered ancestral to the monophyletic European MRSA clade (Fig. 2B). However, since no epidemiological information about these three isolates was available and there is a cultural and geographical proximity between France and Northern Africa, our data do not allow us to infer the precise geographical origin of the MRSA ancestor. The other major clade identified in the phylogeny contained mostly MSSA isolates of sub-Saharan African and Caribbean origin. Ruimy et al. proposed that, similarly to the European CC80 CA-MRSA clone, the CC152 lineage originated in sub-Saharan Africa and spread to Europe and subsequently acquired the SCCmec element (11). However, the results of our genome-based analyses do not support such a sub-Saharan origin.
Multiple CA-MRSA lineages emerged globally in the 1990s and are generally found predominantly in specific geographical regions (10), as exemplified by the European CA-MRSA CC80 clone, USA300 (5,6,12), and now the CC152 CA-MRSA clone. In both the ST8 and the CC80 lineages, SCCmec elements were acquired on multiple occasions (6,13,14); however, successful clones, including both the North and South American variants of USA300 and the European CA-MRSA, are all associated with single acquisitions of an SCCmec element. A similar pattern is observed for the CC152 lineage, where multiple acquisitions of distinct SCCmec elements, including an SCCmec element type XIII (15), are evident, but only the acquisition of SCCmec type V (5C2&5) is associated with its expansion in Europe. Another key characteristic of the CA-MRSA lineages is the presence of PVL-encoding genes; both CC80 and CC152 have ancestral PVL acquisitions, whereas the CC8 lineage has an ancestral PVL-negative population (5,6,11). The finding here of dfrG, a determinant for trimethoprim resistance, in the Sa2 prophage is interesting since trimethoprim resistance is common in S. aureus from sub-Saharan Africa. This has previously been linked to the prophylactic use of trimethoprim in HIV-infected persons (16,17), and thus trimethoprim may select for carriage of such PVL-positive isolates.
In general, CA-MRSA has been categorized as carrying fewer resistance genes than HA-MRSA (18); however, the CC152 MRSA isolates, in addition to mecA and blaZ, also harbor an aminoglycoside resistance gene [aac(6′)-aph(2″)] found to reside in the SCCmec element for 95% (53/56) of the MRSA isolates. Since aminoglycoside antibiotics are mainly used in hospitals (19), this may explain why the CC152 MRSA clone spread and thrived in hospitals, as well as in the community (20). Overall, a greater number of resistance genes was observed among the MRSA than among MSSA isolates, but this could be the result of sampling bias. Interestingly, only a single MRSA isolate was genotypically resistant to fusidic acid due to the carriage of fusB, and investigations into resistance-related mutations in fusA showed the absence of any such variants. This is striking compared to the prevalence and speculated importance of fusidic acid resistance observed in the European CC80 CA-MRSA clone (5).
Of the sub-Saharan cases with epidemiological data, 15% of the CC152 MSSA isolates were reported from bacteremia cases, which likely reflects its high prevalence as a carrier clone in sub-Saharan Africa rather than being an indication of increased virulence potential. The results of this study show an overall low level of resistance, restricted geographical spread, the harboring of smaller SCCmec elements, and carriage of genes encoding PVL in CC152 and hence a resemblance to other CA-MRSA clones.
In conclusion, we demonstrate that multiple SCCmec acquisitions have occurred in the CC152 lineage; however, only one is linked to the successful clone now found throughout Europe. Our analyses support a unique evolutionary path of this PVL-positive clone that most likely indicates a European or North African ancestral origin with the emergence of the CA-MRSA clone around the 1990s, similar to findings for other investigated CA-MRSA lineages.

MATERIALS AND METHODS
Bacterial isolates. A total of 149 S. aureus CC152 isolates, including 93 MSSA and 56 MRSA isolates, sampled in Australia (n = 1), the Caribbean (n = 24), Europe (n = 73), northern Africa (n = 2), and sub-Saharan Africa (n = 49) between 1999 and 2015 were included in this study. To obtain a representative temporal and spatial collection of CC152 genomes, isolates and publicly available genomic data were selected and obtained from national reference collections, NCBI's Reference Sequence Database (https://www.ncbi.nlm.nih.gov/refseq/), and the MRSA TypeCat database (21), which contains a systematic review of S. aureus in all indexed literature with information on genotypic data published between 2000 and 2015 (see Table S1). Microreact (https://microreact.org) v5.112.3 was used to visualize the geographical and temporal distribution of the CC152 isolates (Fig. 1).
Ethics. All isolates and epidemiological information were obtained either based on previously published work or by local regulations that allow use for research.
Genome sequencing. In this study, we performed whole-genome sequencing on 139 CC152 isolates. Libraries were generated using a Nextera XT DNA Library Prep kit (Illumina) to obtain paired-end data on Illumina's NextSeq and MiSeq platforms with 300- or 500-cycle kits. All generated sequence reads are deposited at the European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena) under study accession no. PRJEB36544. The remaining 10 genomes were downloaded from the NCBI (Table S1).
Phylogenetic analysis. SNPs were detected in the core genome using the Northern Arizona SNP pipeline (NASP [22]) by aligning Illumina reads against the 2.67-Mbp chromosome of S. aureus ST152 MSSA strain SA17_S6 (GenBank accession number CP010941). Positions with a <10-fold sequencing depth and/or <90% unambiguous base calls were removed. Putative signs of recombination, defined as more than three consecutive SNPs observed in individual isolates or clusters of isolates in the resulting SNP matrix, were removed as previously described (5). Relatedness was investigated using the maximum-likelihood algorithm as implemented in PhyML v3.0 (23,24), using smart model selection with the Bayesian information criterion, 100 bootstrap replicates, and subtree-pruning and -regrafting rearrangements for improved tree structures. For rooting, the chromosomes of five unrelated S. aureus strains belonging to ST8, ST30, ST45, ST80, and ST398 (GenBank accession numbers CP000730, BX571856, NC_021554, CP003194, and AM990992, respectively) were used individually. The phylogenetic trees were visualized using iTOL v4.2.3 (25).
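The two filtering steps in this paragraph can be sketched in Python. This is a minimal illustration, not the NASP implementation or the authors' script: the function names are hypothetical, and the `max_gap` parameter (how close neighboring SNPs must be to count as consecutive) is an assumption, since the text does not give a distance criterion.

```python
from typing import Iterable, List, Tuple

MIN_DEPTH = 10              # positions with <10-fold depth are removed
MIN_UNAMBIGUOUS = 0.90      # positions with <90% unambiguous base calls are removed


def keep_position(depth: int, unambiguous_fraction: float) -> bool:
    """Apply the depth and base-call thresholds stated in the text."""
    return depth >= MIN_DEPTH and unambiguous_fraction >= MIN_UNAMBIGUOUS


def find_dense_snp_runs(
    positions: Iterable[int],
    max_gap: int = 10,      # assumed spacing; not specified in the text
    min_snps: int = 4,      # "more than three consecutive SNPs"
) -> List[Tuple[int, int]]:
    """Return (start, end) of runs of closely spaced SNPs in one isolate.

    A run is a series of SNP positions in which each position lies within
    max_gap bases of the previous one; runs with at least min_snps SNPs are
    reported as putative recombinant regions.
    """
    runs: List[Tuple[int, int]] = []
    run: List[int] = []
    for pos in sorted(positions):
        if run and pos - run[-1] > max_gap:
            if len(run) >= min_snps:
                runs.append((run[0], run[-1]))
            run = []
        run.append(pos)
    if len(run) >= min_snps:
        runs.append((run[0], run[-1]))
    return runs


if __name__ == "__main__":
    toy_positions = [100, 103, 105, 109, 5000, 9000, 9002]  # made-up coordinates
    print(find_dense_snp_runs(toy_positions))
    print(keep_position(depth=8, unambiguous_fraction=0.99))
```

With the made-up coordinates above, only the four SNPs between positions 100 and 109 form a run long enough to be flagged; the isolated SNP and the pair near position 9000 are retained.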
Coalescent analyses. Bayesian analyses were used to estimate the TMRCA using BEAST v1.8.4 (26). BEAST was run using the GTR and Hasegawa-Kishino-Yano (HKY) substitution models combined with both strict and random local clocks under different coalescent tree priors (Bayesian Skyline, constant population, and exponential growth). Evaluation of the model combinations was (27). All Markov chain Monte Carlo analyses were run twice with chain lengths of 200 million, with sampling every 20,000th generation and a burn-in set at 10%. A time tree was obtained using the maximum sum clade credibility topology with TreeAnnotator v2.5.2 from the BEAST package and visualized using iTOL.
Genetic typing. Whole-genome sequences were de novo assembled using SPAdes v3.13.1 (28). Assemblies were used for in silico multilocus sequence typing (MLST) (28,29) and passed through spaTyper v1.0 (30), ResFinder v2.1 (31), and SCCmecFinder v1. (32) for the determination of spa types, resistance genes, and SCCmec characterization, respectively, at the Center for Genomic Epidemiology (https://cge.cbs.dtu.dk). Furthermore, CLCbio's Genomics Workbench v9 (Qiagen, Aarhus, Denmark) was used for additional investigations of resistance genes and SCCmec characterization. Similarly, detection of point mutations related to fusidic acid resistance was performed using BLASTN searches across the collection using fusA extracted from the ST152 reference genome (GenBank accession no. CP010941). The draft genomes were analyzed for the presence of PVL-encoding genes lukF/S-PV and for intradiversity of the associated Sa2 prophage within the CC152 collection and other S. aureus clonal lineages. The full-length Sa2 prophage in the ST152 reference was identified using the PHAge Search Tool Enhanced Release (PHASTER) tool (http://phaster.ca/).
The identified Sa2 prophage in the ST152 reference genome was used as reference in a SNP-based analysis, including all PVL-positive CC152 isolates together with genomes of other CA-MRSA PVL-positive lineages (CC1, CC8, CC30, CC59, CC80, and CC93; GenBank accession numbers NC_003923, CP000730, CP002110, NC_016928, NC_017351, and CP002114, respectively). The SNP alignment of the prophage was purged for putative recombination, and a phylogeny was constructed with PhyML v3.0 as previously described.
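For the Markov chain Monte Carlo settings given under Coalescent analyses (chain length of 200 million, sampling every 20,000th generation, 10% burn-in), the number of posterior samples follows from simple arithmetic. The sketch below uses a hypothetical helper name and ignores the initial state that a sampler may also log; whether the two runs were combined is not stated in the text, so the combined total is shown only as an assumption.

```python
def retained_samples(chain_length: int, sample_every: int, burn_in_percent: int) -> int:
    """Number of logged samples left after discarding the burn-in.

    Ignores the initial state (generation 0), which some samplers also log.
    """
    total = chain_length // sample_every
    discarded = total * burn_in_percent // 100
    return total - discarded


if __name__ == "__main__":
    per_run = retained_samples(
        chain_length=200_000_000,
        sample_every=20_000,
        burn_in_percent=10,
    )
    print(f"retained samples per run: {per_run}")
    # Assumption: the two independent runs are pooled after burn-in removal.
    print(f"retained samples if both runs are pooled: {2 * per_run}")
```

Each run therefore logs 10,000 samples, of which 9,000 remain after burn-in.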
Data availability. All generated sequence reads are deposited at the ENA (https://www.ebi.ac.uk/ena) under study accession no. PRJEB36544.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only.

ACKNOWLEDGMENTS
We thank Alexandra Medina, Lone Ryste Hansen Kildevang, Elvira Chapka, and Mette Holm, Statens Serum Institut, Copenhagen, Denmark, for technical assistance. We thank the ESGS group of ESCMID for intellectual support of the study and the members for their interest in sharing of bacterial isolates and metadata.
Conceptualization of the study was done by S.B., A.R.L., and M.S. Data curation was performed by S.B., followed by bioinformatic analysis by S.B., M.S., P.M.S., T.B.J., and B.L. Funding was acquired by A.R.L. and F.L. and with project administration by S.B. All authors, excluding S.B., M.S., T.B.J., B.L., and P.S.A., provided resources. The project was supervised by A.R.L. and M.S. The original draft and visualizations were done by S.B. and M.S. All authors contributed to writing, followed by review and editing.