Genomic diversity of prevalent Staphylococcus epidermidis multidrug-resistant strains isolated from a Children’s Hospital in México City in an eight-years survey

Staphylococcus epidermidis is a human commensal and pathogen worldwide distributed. In this work, we surveyed for multi-resistant S. epidermidis strains in eight years at a children’s health-care unit in México City. Multidrug-resistant S. epidermidis were present in all years of the study, including resistance to methicillin, beta-lactams, fluoroquinolones, and macrolides. To understand the genetic basis of antibiotic resistance and its association with virulence and gene exchange, we sequenced the genomes of 17 S. epidermidis isolates. Whole-genome nucleotide identities between all the pairs of S. epidermidis strains were about 97% to 99%. We inferred a clonal structure and eight Multilocus Sequence Types (MLSTs) in the S. epidermidis sequenced collection. The profile of virulence includes genes involved in biofilm formation and phenol-soluble modulins (PSMs). Half of the S. epidermidis analyzed lacked the ica operon for biofilm formation. Likely, they are commensal S. epidermidis strains but multi-antibiotic resistant. Uneven distribution of insertion sequences, phages, and CRISPR-Cas immunity phage systems suggest frequent horizontal gene transfer. Rates of recombination between S. epidermidis strains were more prevalent than the mutation rate and affected the whole genome. Therefore, the multidrug resistance, independently of the pathogenic traits, might explain the persistence of specific highly adapted S. epidermidis clonal lineages in nosocomial settings.

267 methods). The analysis showed that SE strains were the most abundant (573 strains), followed by 268 S. aureus (146 strains), and other coagulase-negative Staphylococcus (CoNS) species in low 269 proportion (81 strains) (Fig. S1). The SE group was composed mainly of strains recovered from 270 blood samples and catheters (Fig. S1).

272
The number of antibiotic multi-resistant SE strains per year showed similar profiles through the 273 studied period (Fig. 1, A-H). Methicillin resistance was found in about 80% of the SE isolates. 274 SE strains multi-resistant to nine antibiotics were the most frequently found throughout the eight 275 years. In contrast, S. aureus strains showed resistance to about four antibiotics as the most 276 common profile (Fig. S2). Similarly, the resistance to each antibiotic remained without change, 277 during the eight-year study (Fig. 1, I-P). These results suggest the persistence of a multidrug-278 resistant clone or few clones of SE at the hospital during the eight years, or frequent horizontal 279 exchange of genes encoding antibiotic resistance between SE clones. To study these alternatives, 280 we choose one or more SE strains per year to totalize 17 clinical SE coagulase-negative (CoNS) 281 strains (Fig. 1). These particular strains came from nosocomial infections of 14 newborns and 282 three adults, isolated from three different infection sites: blood (8 strains), catheters (7 strains), 283 cerebrospinal fluid (1 strain), and soft tissue (1 strain) ( Table 1).  (Table S4). To assert that the 17 assemblies represent a substantial part of 290 the SE genomes, we compared the total genome length, and the number and length of the 291 predicted ORFs, with 11 complete genomes of SE downloaded from GenBank (Table S5). There 292 were no differences between the genome length of the SE INPer genomes and the complete 293 genomes from the GenBank (unpaired t-test = 2.33; p-value = 0.022), in ORFs number (unpaired 294 t-test = -1.68; p-value = 0.117), or ORFs length (unpaired t-test = 2.60; p-value = 0.014) ( Table  295 S5). Therefore, we can conclude that the SE INPer genome sequences obtained here provide a 296 broad catalog of genes per genome useful for comparative genomics. 297 298 Genomic, pangenomic, and phylogenetic relationships among SE isolates. To define the 299 genomic similarity between the 17 clinical SE strains, we performed whole-genome pairwise 300 nucleotide identity estimates (Average Nucleotide Identity by Mummer, ANIm) (Richter et al. 301 2016). The 17 SE showed high genomic ANIm values of about 99%, covering more than 90% of 302 the genome length (Fig. S3). One exception is strain S10, which showed ANIm values of about 303 97% concerning the rest of the strains. 304 305 Besides the genomic identity between SE isolates, we wanted to investigate the extent of their 306 genetic variability by performing a pangenome model. To this end, the SE collection was 307 complemented by the inclusion of 12 complete genomes of SE strains downloaded from 308 GenBank (Table S3, NCBI reference complete genomes). The model obtained using the 309 GET_HOMOLOGUES software package (Contreras-Moreira & Vinuesa 2013), indicated an 310 open pangenome (Fig. S4). The core genome component for the 29 SE strains was predicted to 311 consist of about 1575 gene clusters, whereas the sum of genes unevenly distributed in the 29 SE 312 genomes (accessory component) contains 4360 gene clusters. 313 To know the phylogenetic relationship of the SE from INPer in the context of reference SE 314 strains, we did an un-rooted ML phylogenetic tree using the predicted 1575 concatenated core 315 proteins (Fig. 2). There were three clades separated by the largest branches in the tree that 316 comprise most of the SE local strains, and one or more SE strains isolated worldwide (Fig. 2,  317 clades B, C, D). The clade marked as D in the tree consisted of two groups, one of which 318 includes only reference SE strains, whereas the other had most of the SE strains of the collections 319 studied. Strain S10, the most different strain by ANIm in the SE collection, was grouped in the 320 clade A with the commensal strains SE ATCC12228 (2) and SE 14.1.R1 isolated from USA and 321 Denmark.

323
The distribution of accessory genes in individual SE strains coincided with the phylogenetic 324 clades. The SE strains grouped in three phylogenetic clades are closely related by their similitude 325 in the gene presence/absence profile (Fig. 3). Despite the high identity of the SE strains, there is 326 still considerable individual variation that may account for adaptations to local milieu.  Thomas et al. 2007). 332 The analysis showed a total of 8 different STs; seven of them already recorded in the database. 333 The S10 strain had an unassigned ST in the database and only differed by a single amino acid 334 substitution in the YqiL protein (L370C). The ST2, ST5, and ST23 are worldwide distributed 335 and are the most represented in our sample (Lee et al. 2018;Miragaia et al. 2007). In agreement 336 with global data, ST35, ST59, ST81, ST89 are less frequently represented. The clonal 337 relationships among STs determined by eBURST, indicate that founder clones are ST2 and ST5, 338 whereas the other four STs (ST 59, 81, 89 and 23), are peripheral clones mostly related to each 339 than to the primary founder clones (Fig. S5). 340 341 Virulence genes. Several known virulence genes of SE are shared by commensal and pathogenic 342 strains (Otto 2009). Among them, we found the cluster icaADBCR (biofilm formation) in nine 343 out of the 17 SE genomes analyzed, except for icaC, absent in strain S05 ( Table 1). The phenol-344 soluble modulins (PSMs), involved in inflammatory response and lysis of leukocytes were 345 present in the genomes of all SE strains (Peschel & Otto 2013). The novel ESAT 6 (esaAB, 346 essABC, and esxAB genes) secretion system implicated in immune system evasion and 347 neutrophil elimination was only present in S10 strain (Burts et   Frequently, SE isolates contain the genomic island ACME, composed by arginine deaminase 351 catabolic (arg) and an oligopeptide permease operons (opp) (Miragaia et al. 2009). ACME has 352 been associated with successful adaptation to the skin and mucosal surfaces outcompeting other 353 related bacteria (Planet et al. 2013). The diversity of ACME has been organized in six distinct 354 types according to their gene composition (Table S2 and references therein). We looked for the 355 presence of ACME genes in the 17 SE genomes of INPer collection. By means of BlastP 356 comparisons we identified genes belonging to the ACME Type I (arg + opp + ) in genomes of SE 357 strains S03, S17, and S21, and the ACME Type V that contain the arg + opp +, and ars + (arsenate 358 resistance operon), and kpd+, a potassium transporting operon in S24 (Centers for Disease  (Table S2). Even though some other SE 360 strains present genes of the arg and ars operon, they lacked the genes of the opp operon; 361 therefore, they were not assigned to ACME known types. 362 363 Antibiotic resistance genotype and phenotype. The SE strains were tested for their 364 susceptibility to methicillin and other -lactams antibiotics as described in methods. All the 17 365 SE strains have the -lactamase gene (blaZ) and their regulators (blaR and I), which are probably 366 responsible for the broad resistance spectrum to penicillin, carbapenems, and cephalosporins 367 determined by the VITEK ® 2 system (Table 2). Methicillin resistance was evaluated by the disc 368 diffusion method for cefoxitin and corroborated by the presence of mecA in the genomes (Table  369 1 and Table 2). 12 out of 17 SE strains showed agreement between the cefoxitin resistance 370 phenotype and the mecA genotype. Even thought strains S03, S05, S14, S15, and S21, have the 371 mecA gene in the genome, they were scored as susceptible for cefoxitin (disc diffusion ≥ 25 372 millimeters). To support that the 15 mecA positive strains are Methicillin-Resistance SE (MRSE) 373 strains, we evaluated their resistance to oxacillin. Commonly, the Methicillin-Resistance S. 374 aureus (MRSA) strains are evaluated by the resistance to oxacillin (Centers for Disease Control 375 and Prevention 2019). In the SE collection, only the S07 strain was susceptible to oxacillin (6 376 g/ml) in plate assays, despite the detection of mecA in the genome. Altogether, these results 377 indicate that most of the SE strains (15) studied here are MRSE, identified by the presence of the 378 mecA gene; any phenotypic inconsistencies could be due to lack of expression of the mecA, 379 heteroresistance, or the limitation of the current antibiotic susceptibility testing methods (Band et (Table 1; Fig. S6). The analysis demonstrated the 388 presence of the community-acquired SCCmec type IV cassette in 13 out of 14 methicillin-389 resistant strains. The SE INPer strains S10 and S16 lack the SCCmec cassette, and no mecA gene 390 was detected. Although the S21 strain has a mecA gene, we were unable to find other gene 391 elements to show the presence of a mec cassette. Moreover, strains S07, S09, S13 strains contain 392 an additional SCCmec type VIII cassette in tandem with the SCCmec IV cassette. The S07 strain 393 carries a contig of 36 535 bps of the SCCmec type IV and VIII, suggesting the probable structure 394 of the recombined cassette (Fig. S7).

396
The genomic analysis revealed that some SE strains studied here included genes for resistance to 397 fluoroquinolones, macrolides, sulfonamides, aminoglycosides, tetracycline, and other antibiotics 398 not used as the first choice in clinical therapy ( Table 2). The results given in Table 2 corroborate 399 that in most of the cases, the probable gene responsible for the resistance is present in the 400 genomes. Besides, non-synonymous mutations in the antibiotic target proteins GyrA and RpoB 401 were identified in some SE strains resistant to quinolones and rifampicin. Despite other SE 402 strains lack these mutations, they still were resistant to these antibiotics. Then, other mutations in 403 the antibiotic target protein or other genetic mechanisms not yet known would be responsible for 404 these resistances.
405 Mobile genetic elements. The genomic variability observed in the SE strains suggests active 406 processes of recombination and gene exchange. To study this concern, we first look for 407 prophages and CRISPR-Cas related systems in the genomes. Prophage footprints were found in 408 10 out of 17 genomes of the SE strains. The most significant prophage hits detected by the 409 PHAST program, were for genomic regions spanning about 28 to 95 kb that include an 410 attachment site, a signature of lysogenic phages (Arndt et al. 2017) (Table S6). In this analysis, 411 prophages were found integrated into the genomes of some SE strains, such as CNPH82 found in 412 the SE strains S14, S17, and S18 (Daniel et al. 2007). Some other prophages such as StB20 and 413 SpBeta were present in S03 and S16 strains respectively and, the prophages IPLA5 in strain S07 414 and IPLA7 in S12 and S15 strains (Gutierrez et al. 2012). In the remaining SE strains, prophages 415 sequences were not detected. 416 417 SE strains have also acquired defense mechanisms against phage infection. The search for 418 CRISPR-Cas immune systems identified nine out of the 17 SE INPer strains carrying a CRISPR-419 Cas Type III system. The system III is composed by the Cas1 and Cas2, responsible for spacer 420 processing and insertion; the ribonuclease Cas6, and the cascade proteins Csm1 to Csm6, 421 involved in the processing of the target transcript (Table S7). The CRISPR-Cas Type III system 422 has been already reported to confer immunity to phages as well as to conjugative plasmids in   (Table S8). The presence of IS256 432 has been found within pathogenic SE, associated with biofilm formation and virulence 433 (Kozitskaya et al. 2004;Murugesan et al. 2018). Among the SE strains of our collection, the 434 isolates having the IS256 also contain the ica operon, confirming previous observations. 435 Exceptionally, only the strain S21 has the ica genes but lacks IS256.  To determine whether or not the recombination estimates were affected by the sample 449 composition of SE strains, we design several control tests, with distinct groups of genomes. First, 450 we discarded the most divergent S10 strain of the SE collection and ran the ClonalFrameML test 451 only with the 16 SE most related SE strains of the collection. As shown in Figure 4, the r/m rate 452 for this set reduced to a median of four, and this value is not significantly different respect to the 453 r/m of SE of the complete genome of SE strains obtained from the GenBank (z-score = 1.4, P-454 value 0.135) (Fig. 4, boxplot 25). Second, we computed the r/m rate in eight different sets 455 randomly selected among 260 complete and draft genomes of SE strains from the GenBank (Fig.  456 4, 26-33; Table S3). Some sets (Ctr2 and 8) display the lowest r/m values, whereas the rest 457 control sets have r/m upper than two up to four. These results indicate that the strains sample 458 composition influence the recombination estimates.

460
The RaxML nucleotide phylogenetic tree used as a reference to estimate recombination looks 461 similar to the core protein phylogeny presented in Fig. 2; SE strains within clades maintain 462 cohesive relationships (Fig. 5). However, multiple recombination events were detected in the 463 ancestral nodes leading to the SE strains. The most prominent branch (red dot line in Fig. 5) 464 divided the SE strains into two large clades, one including the clade D and the other constituted 465 by clade B and C. Within different divergent lineages recombination introduces much more 466 nucleotide variants than mutation as presented before (r/m). These results indicate that local 467 hospital settings SE strains may contain enough genomic diversity despite their close relationship 468 with the main clonal ST complexes of worldwide distribution. In this work, we conducted a survey 476 at the Instituto Nacional de Perinatología "INPer" in México City, over a period spanning eight 477 years, to register changes in the antibiotic profile of staphylococci species and strains. We focus 478 on SE because it was the bacteria most frequently found among the staphylococci isolates and 479 showed a remarkable multi-resistance pattern. The number of antibiotic resistances remained 480 very similar year by year with most SE strains being multi-resistant to nine antibiotics; a minor 481 representation of low-resistant strains (< 5 antibiotics) was found. Therefore, the SE population 482 within the INPer hospital was highly stable and led us to question its genetic composition. 483 Specifically, we address the hypothesis that a single or few clones are the basis of the 484 normalization of all the SE multi-resistant strains in the children´s health care unit studied. . Likely, these are commensal SE strains that invaded the patients in the course of their 495 hospital stay even though we cannot discard they use other pathogenicity mechanisms. These 496 strains showed resistance to multiple antibiotics, and three of them contain a composite SCCmec 497 cassette formed by type IV and VIII gene elements. We propose a probable structure of the 498 combined SCCmec IV and VIII cassette ( Figure S7). It is an unusual combination of SCC 499 elements, but there are reports in the literature of mosaic chromosomal staphylococcal mec 500 cassettes (Heusser et al. 2007). However, this SCCmec genetic element should be subject to 501 further corroboration. Classification of SCCmec is still hard to discern due to the variability and 502 presence of repeated elements which difficult the correct assembly of the region (International 503 Working Group on the Classification of Staphylococcal Cassette Chromosome 2009). The 504 chromosomal cassettes might be hot spots of recombination of heavy metal resistance genes, 505 insertion sequences, and antibiotic resistance genes recruited by HGT (Xue et al. 2017).

507
The genomic antibiotic resistance spectrum of SE strains is very diverse, with some genes 508 present in most strains and others only in few. Examples of diversification of the resistance 509 mechanisms are the presence of membrane efflux pumps (NorAB), which may confer resistance 510 to quinolones as well, and several modifying enzymes (AAC, APH) that inactivate 511 aminoglycosides such as kanamycin. Indeed, the fosB gene, which encodes the resistance to 512 fosfomycin was found in 9 out of 17 SE strains. In several of these instances, the resistance 513 phenotype coincided with the presence of one or more genes, encoding modifying or degrading 514 enzymes and mutant protein targets for some antibiotics (Table 2).

516
The genome analysis also indicates some mechanisms for antibiotic resistance, including non-517 synonymous substitutions in the housekeeping genes gyrA and rpoB. In the gyrase (gyrA) it was 518 found the amino acid S84F change which confers quinolone resistance, whereas, in rpoB (-519 subunit of the RNA polymerase), a double amino acid substitution D471E: I527M, and a single 520 I527M were identified. The double mutant rpoB D471E: I527M has been recognized elsewhere 521 as the most common cause of worldwide rifampicin resistance (Lee et al., 2018). Indeed, the 522 presence of this rpoB variant reduces the susceptibility to vancomycin and teicoplanin. In this 523 work, the rpoB double mutant was detected in S02, S05, and S08 all belonging to ST23 clonal 524 type, while the single mutant I527M was observed in the strains S14, S17, S18, and S19, which 525 are within the ST2 clonal type. Both are the clonal lineages worldwide distributed reported by 526 Lee et al., (2019). These INPer strains can form biofilms and contain most of the virulence 527 determinants (Table 1). Therefore, they are very adapted SE strains and continuously present in 528 the INPer.  (Meric et al. 2018). These Genome-wide-association (GWAS) studies showed the enrichment of 535 several genes involved with methicillin resistance, biofilm formation, cell toxicity, and 536 inflammatory response in pathogenic SE isolates (Meric et al. 2018). Therefore, pathogenic and 537 commensal SE strains likely coexist in the same infection site, but current clinical methods of 538 isolation prevent us from distinguishing one from the other. Eight icaout of 17 strains analyzed, 539 could not form biofilms even thought they were recovered from ill patients (Table 1). 540 541 Furthermore, here are various footprints of MGEs, including prophages, ISs, and the phage 542 immunity CRISPR-Cas systems in the SE genomes that likely contribute to the adaptability by 543 the acquisition of virulence and antibiotic resistance factors. As expected, due to its mobile 544 nature, these elements do not follow a uniform distribution in the phylogeny, indicating frequent 545 genetic exchange in the SE population. Together with HGT, homologous recombination may be 546 a factor for genetic diversification of SE in hospital settings. In our work, extensive genome 547 analysis of the rates of recombination versus mutation suggests that recombination affects the 548 whole genome and not only a particular class of genes. It has been estimated that recombination 549 could involve 40% of the genome of SE, whereas in S. aureus recombination comprises the 24% 550 portion (Meric et al., 2015). Although recombination rates depend strongly on the sample of 551 strains used for the analysis, the estimated r/m values reported agrees with other recombination 552 test performed with distinct samples of SE, few or whole-genome markers as well as reported 553 r/m numbers in the literature (Meric et al. 2015;Miragaia et al. 2005). Therefore we can 554 conclude that the SE population despite its whole low level of nucleotide variation (ANIm > 555 97%) shows cohesive clonal behavior but frequent gene exchange and recombination. 556 Conclusions 557 At local hospital settings, pathogenic, and commensal SE strains coexist, but it is hard to discern 558 if they are contaminants, commensal colonizers, or virulent strains (Widerstrom et al. 2016). 559 Indeed, single-colony testing for the identification of SE isolates limits to know the extent of 560 multiclonal or multi-species infections (Harris et al. 2016;Van Eldere et al. 2000). In the present 561 study, some analyzed SE strains came from nosocomial patients, but lack ica genes, a classical 562 virulence marker. However, we cannot exclude that these putative commensal SE strains are 563 pathogens in a non-determined manner, or in other conditions. Likely they are part of the intra-564 hospital non-pathogenic microbiome. These presumed SE commensal strains, as well as the 565 biofilm formers considered pathogenic SE strains are multi-resistant to antibiotics. The results 566 present here of an 8-year survey, suggest that the multi-resistance to antibiotics might drive 567 adaptation and persistence of specific SE clones in hospital settings. HGT and recombination 568 might play a crucial role in the origin of the pathogenic clones, by moving and recombining 569 antibiotic resistance and virulence genes in distinct genomic clonal backgrounds, including non-570 pathogenic strains. Therefore, the clinical and genetic factors that influence the adaptability 571 stability and change of SE community overtime should be addressed in detail in future studies. 572 Acknowledgments 573 Authors recognized the Instituto Nacional de Perinatología (INPer) for providing the S. 574 epidermidis strains used in this work. Thanks to Gabriela Guerrero, Luis Lozano and José 575 Espíritu for computational help. We express thanks to Olga M. Pérez-Carrascal for her advice on 576 the use of ClonalFrameML, and to Santiago Castillo for discussion on recombination analysis. 577 We appreciate the valuable work of the editor and reviewers to improve the successive versions 578 of the manuscript.     Antibiotic resistance phenotypes and the genotype profile in S. epidermidis strains