Staphylococcal Phages Adapt to New Hosts by Extensive Attachment Site Variability

ABSTRACT Bacterial pathogens commonly carry prophages that express virulence factors, and human strains of Staphylococcus aureus carry Sa3int phages, which promote immune evasion. Recently, however, these phages have been found in livestock-associated, methicillin-resistant S. aureus (LA-MRSA). This is surprising, as LA-MRSA strains contain a mutated primary bacterial integration site, which likely explains why the rare integration events that do occur mostly happen at alternative locations. Using deep sequencing, we show that after initial integration at secondary sites, Sa3int phages adapt through nucleotide changes in their attachment sequences to increase homology with alternative bacterial attachment sites. Importantly, this homology significantly enhances integrations in new rounds of infections. We propose that promiscuity of the phage-encoded tyrosine recombinase is responsible for establishment of Sa3int phages in LA-MRSA. Our results demonstrate that phages can adopt extensive population heterogeneity, leading to establishment in strains lacking bona fide integration sites. Ultimately, their presence may increase virulence and zoonotic potential of pathogens with major implications for human health.

appear to be as severe as those caused by human-associated strains (8). Although human infections with LA-MRSA are considered to be the result of spillovers from livestock, there have been examples of transmissions between household members as well as into community and health care settings (2,3,7). Importantly, such transfer events were associated with LA-MRSA strains carrying prophages of the Sa3int family (2,3,7,9). As 95% of tested Danish pig herds are positive for LA-MRSA (DANMAP, 2019), establishment of Sa3int phages in these strains may pose an increased risk of community spread of LA-MRSA strains.
Integration of Sa3int phages in S. aureus occurs through orientation-specific recombination between identical 14-bp phage and bacterial core attachment sequences (attP and attB, respectively) and is mediated by a phage-encoded tyrosine recombinase, the integrase Int (10,11). In livestock strains, the sequence corresponding to attB has two nucleotide changes (G at position 8 and T at position 11): 5′-TGTATCCGAATTGG-3′ (attB LA). These substitutions do not alter the amino acid sequence of the β-hemolysin encoded by hlb, in which attB is located, but they decrease the ability of Sa3int phages to insert at this location by approximately 2 log (12). Accordingly, in LA-MRSA strains, Sa3int prophages are mostly located at alternative integration sites with variable positions in the bacterial genome but occasionally also in attB LA (2, 12-15).
S. aureus has on several occasions demonstrated its ability to alter its preference for human or animal hosts. In general, such "host jumps" are thought to occur when infections of less preferred hosts are followed by host adaptation, ultimately leading to colonization (2,16). Host adaptation often involves acquisition or loss of mobile genetic elements, including prophages (1). However, little is known of the molecular events involved in the process. Using massively parallel sequencing, we examined the fate of Sa3int phages interacting with an S. aureus strain carrying the attB LA of LA-MRSA. We found that initial, rare integration events at alternative integration sites located across the bacterial genome led to phage populations with highly variable attP sequences, most of which showed increased resemblance to the bacterial attachment sequence. Importantly, infections of naive strains carrying the attB LA site with such phage pools resulted in increased phage integration. Our results explain how Sa3int phages, by adapting their attP sequence to alternative integration sites in the LA-MRSA genome, can establish in these strains, which ultimately may be more successful at colonizing and infecting humans and at disseminating in the human population.

RESULTS
Sa3int phages are adapting to alternative attB sites of LA-MRSA CC398. In a recent study, 20 LA-MRSA CC398 strains from pigs and humans in Denmark were isolated and found to contain Sa3int prophages. In these strains, the prophages were located at one of six different genomic locations (variants I to VI) (2), and isolates of the respective variants came from the same household and are epidemiologically related. The 14-bp primary bacterial integration site in hlb carried two nucleotide mismatches (designated attB LA) compared to the one found in human strains, as reported in other studies of LA-MRSA strains (12,14,15). We determined the sequences flanking the prophage (attL and attR) in the LA-MRSA CC398 genomes, and through comparisons with strains that lack the prophage, we deduced the corresponding attB sequences (Fig. 1). In all cases except one (variant V), the attL sequences differed from attR. This indicates nonmatching attB and attP sites, as otherwise attR and attL would be identical, as seen with the original attB site in hlb of S. aureus 8325-4. Searching a 300-bp area flanking the alternative attB site did not reveal any conserved motifs.
To examine whether mismatches between attL and attR affected excision of the prophage, we induced the lysogens with mitomycin C and observed that the phages could be excised in all strains. From the resulting phages, we determined the attP sequences using PCR amplification and Sanger sequencing (Fig. 1). For eight phages (one isolate each of variants II, IV, and V and five isolates of variant VI), the attP sequences were identical to that of the model Sa3int phage φ13 (10), showing that in these cases integration in the variant attB sites did not affect the attP sequence of the excised phage. In the remaining 12 phages, however, mutations had arisen in the phage attP sequences. Importantly, in all cases, the changes increased the sequence similarity between attP and the alternative attB site of the livestock-associated strains, as indicated in Fig. 1. These results suggest that Sa3int phages may be promiscuous with respect to both integration and excision and that integration of prophages at alternative bacterial attachment sites may alter the phage in such a way that its attP sequence bears greater resemblance to alternative attB sequences.
Phage integration at multiple locations in a model strain carrying attB LA. With the aim of investigating how phage heterogeneity arises, we employed a derivative of S. aureus NCTC8325-4, designated S. aureus 8325-4attBmut, which contains 2-bp point mutations in hlb to create the attB LA of the LA-MRSA CC398 lineage (12). With this strain, we performed liquid infection with φ13kanR, a derivative of the Sa3int phage φ13 that encodes staphylokinase (sak) but in which the immune evasion virulence genes scn and chp are replaced by the kanamycin resistance cassette aphA3 (12).
From eight independent lysogenization experiments, we selected 22 lysogens as being resistant to kanamycin. Alternative integration sites were confirmed for 20 of the lysogens by PCR (hlb+ sak+), and two lysogens harbored the phage in the mutated hlb site (hlb− sak+) (Fig. S1). The 22 isolates were whole-genome sequenced, and analysis revealed 17 different integration sites for φ13kanR in S. aureus 8325-4attBmut that were widely distributed across the bacterial chromosome (Fig. S2) and with the attB sequences listed in Fig. 2. The integrations occurred in both noncoding and coding regions and were independent of transcriptional orientation.
When the 14-bp sequences of all alternative attB sites were compared (Fig. 2), they showed 29 to 86% identity (4 to 12 of 14 positions) with the original attB core sequence in the hlb gene. However, the last three base pairs (5′-TGG-3′) were highly conserved, being present in 20 of 22 attB sites, with lysogens 6 and 20 being the exceptions. The nucleotides G at position 8 and T at position 11, which distinguish attB LA from attB, were not found in the same combination in any of the 17 attB sequences. Based on the conserved base pairs between the alternative attB sites, we searched the chromosome of S. aureus NCTC8325 for the presence of 5′-NNNNNNCWNNCTGG-3′ (where W = A or T) and obtained more than 700 hits. Thus, there appears to be a multitude of potential integration sites in the staphylococcal genome.
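The degenerate-motif search and the identity comparison described above can be illustrated with a short sketch. This is not the authors' procedure (the text does not say which tool was used for the search); the function names, the both-strand scan, and the handling of overlapping matches are assumptions made for illustration only.

```python
import re

# IUPAC codes needed for the motif in the text (W = A or T, N = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTWN", "TGCAWN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(genome, motif="NNNNNNCWNNCTGG"):
    """Find all (overlapping) motif matches on both strands.

    Returns a sorted list of (0-based start on the forward strand,
    strand, matched sequence read 5'->3' on its own strand).
    """
    genome = genome.upper()
    # A lookahead lets the regex report overlapping matches.
    pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in motif) + "))")
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(genome)]
    rc = reverse_complement(genome)
    for m in pattern.finditer(rc):
        # Convert the position on the reverse strand to forward coordinates.
        start = len(genome) - m.start() - len(motif)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


def percent_identity(a, b):
    """Percentage of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    att_b_la = "TGTATCCGAATTGG"  # attB LA core sequence as given in the text
    # attB LA carries G at position 8, so it does not match the motif itself.
    print(find_sites(att_b_la))
```

On a real chromosome the hit count depends on whether both strands and overlapping matches are counted, so the number reported in the text should not be expected to be reproduced exactly by this sketch.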
Three of the alternative attB locations were observed as integration sites in lysogens obtained in independent lysogenization rounds, i.e., the SAOUHSC_01067 coding sequence (CDS) conserved hypothetical protein (lysogens 1, 14, and 18), the intergenic region between open reading frames encoding the hypothetical proteins SAOUHSC_01301 and SAOUHSC_01304 (lysogens 5 and 13), and the SAOUHSC_00125 cap5L protein/glycosyltransferase (lysogens 10 and 21). As clonality can be excluded, these integration events show that there is some preference in selection of integration site when the bona fide attB sequence is mutated. However, when we screened the 300-bp flanking regions of the alternative attB sites in S. aureus 8325-4attBmut, we found no common patterns in terms of sequence composition or distance of inverted repeats relative to the alternative attB core sequences (Fig. S3 and S4). Thus, it is still unclear why some integration sites are preferred over others.
Phage evolution following excision from alternative integration sites. In agreement with our observations for Sa3int phages in livestock-associated strains, we found that mitomycin C induced φ13kanR from all lysogens established in the 8325-4attBmut strain, with the number of phage particles varying between 5 × 10³ PFU/mL and 4 × 10⁶ PFU/mL (Fig. S5). This represents up to a 1,000-fold decrease in induction efficacy compared to the 6 × 10⁶ PFU/mL obtained when the phage was induced from its integration site in the nonmutated attB of S. aureus 8325-4 (8325-4phi13kanR control). Spontaneous phage release was also detected for many of the lysogens, ranging from 2 × 10¹ to 3 × 10³ PFU/mL, compared to 1.0 × 10⁴ PFU/mL for the 8325-4phi13kanR control (Fig. S5).
To examine the integration and excision process of φ13kanR at the alternative integration sites, we determined the attL and attR sequences from the genome sequences of the lysogens and deduced the alternative attB sites by comparing with sequences prior to integration of the phage. In addition, we determined the attP sequences by induction of the lysogens and amplicon sequencing of PCR products obtained on phage lysate with primers spanning attP (sequencing depth range, 10,000 to 180,000; average, 100,000).
For the majority of the lysogens (Fig. 3a), attL was identical to attB, and attR was identical to attP, as can be observed by the pattern of letters (representing nonmatching nucleotides) or dots (representing conserved nucleotides). For these lysogens, the integration crossover likely occurred at the 5′-TGG-3′ (Fig. 4a). For the remaining lysogens (Fig. 3b), both attL and attR displayed sequences matching the alternative attB site, with attL matching the 5′ end and attR the 3′ end. In these cases, the integration crossover events may have occurred at variable positions within the core sequences (Fig. 4b).
When assessing attP by amplicon sequencing, we observed remarkable sequence variation at single nucleotide positions in more than 40% of the phage populations obtained from 9 of the lysogens (Fig. 5). When comparing these changes to the sequence of the bacterial integration site from which the phage was derived, we saw that in five instances (lysogens 3, 10, 12, 17, and 21), the excised phages displayed adaptation to the alternative attB site by adopting a nucleotide of the alternative attB sequence (Fig. 5). Phages from lysogens 6, 7, 15, and 23 also displayed single nucleotide substitutions in attP but without matching the alternative attB sequences. These may result from mismatch repair or DNA replication after prophage excision, as has been suggested for E. coli phage P1 (17).
The adaptability of the phage to the alternative integration sites was even more pronounced when all sequence variation of >1% was scored (Fig. 5). Importantly, most of the excised phage pools contained variants with sequence changes adopting the nucleotides of the alternative attB sequences, and multiple sequence variations occurred within the individual pools (Fig. 5, green). Notable exceptions were lysogens 1, 14, and 18, for which no variants at >1% were observed. In these lysogens, φ13kanR had independently integrated in the same attB site, and despite 7 mismatches with the 14-bp attB sequence from 8325-4, resolution to the original attP sequence occurred with the same precision as seen when φ13kanR was excised from attB of 8325-4phi13kanR. In summary, our results demonstrate that excision of φ13kanR from alternative integration sites leads to evolutionary adaptation of the phage to the bacterium by increasing the number of attP nucleotides matching the alternative attB sequences.
Phage adaptation to alternative attB sites. After observing that induction of phages at alternative integration sites led to mutated phage populations with increased base pair matches between attP and the alternative attB sites or attB LA, we wondered whether these phages, in comparison to the original φ13kanR, had increased preference for such sites in a new infection cycle. To address this, we quantified integration by qPCR with primer pairs covering attR. We examined phage pools obtained from lysogens 2 and 7 (designated φlys2 and φlys7) excised from attB LA and compared them to the original φ13kanR with respect to integration in either 8325-4 or 8325-4attBmut (Fig. 6). As expected, we found that for the wild-type, homogeneous φ13kanR, there was much less integration in attB LA than in attB, which matches the attP sequence. In contrast, this difference was essentially eliminated for the φlys2 and φlys7 phage pools. The still rather high integration frequency at the original attB is probably explained by the phage pools still containing phages with the original attP sequence, which continue to integrate at attB. Further, the mutations in these pools significantly increased the integration frequency in 8325-4attBmut compared to φ13kanR with the original attP site. Our results show that a single round of integration and excision dramatically increases the preference of the phage for an alternative or mutated attachment site.

DISCUSSION
Sa3int prophages encode immune evasion factors and are found in most human strains of S. aureus (18,19). In contrast, LA-MRSA commonly lacks Sa3int phages (5), but when present, they increase the risk of transmission between household members and the community (2, 3). The primary integration site for Sa3int phages is naturally mutated in livestock-associated strains, and so integration is infrequent and occurs at alternative sites (12-15). Our whole-genome analysis of S. aureus lysogens with φ13 integrated at alternative sites showed that recombination between nonmatching attB and attP sites leads to mismatches between the attL and attR sequences. Intriguingly, induction of these lysogens resulted in phage populations that were heterogeneous with respect to their attP sequences and that had changes that increased identity to the alternative bacterial integration sites (Fig. 3 and 5). Importantly, we could show that in two cases the nucleotide changes in attP increased phage integration into the naive 8325-4attBmut strain in a new round of infection (Fig. 6). As Sa3int prophages are spontaneously released from alternative integration sites, environmental stimuli are not necessary for dissemination of the phages. Thus, rounds of excision and integration can take place with the potential for adaptation of attP in each round.
When examining Sa3int prophages from outbreak strains of LA-MRSA (2), we observed a greater number of adaptive changes in the attP sites of the excised phages than in our model 8325-4attBmut strain. This suggests that adapted phages have been circulating in the LA-MRSA CC398 population. This notion is supported by a study of the Sa3int phage P282 from an S. aureus CC398 strain, where the attP sequence can be deduced to be identical to attB LA (14), although this was not noted by the authors. Also, reanalysis of genome sequence data of Sa3int prophages in MRSA CC398 isolates from hospital patients in Germany (15) revealed that in 10 of 15 lysogens, the attL and attR sequences were identical to attB LA (Table S3), indicating that the prophages have adapted to the livestock-associated strains. This raises the question of where these phage adaptations occur. In the farm environment, humans are exposed to LA-MRSA on a continuous basis, and as about one in three humans is naturally colonized with S. aureus strains containing Sa3int phages, the livestock-associated strains are exposed to the phage. Once established as a prophage in an LA-MRSA strain, Sa3int phages are released and, if adapted, will integrate more effectively than the original phage into the LA-MRSA population. This in turn may lead to increased transmission from human to human and potentially be the cause of severe and difficult-to-treat infections.
Integration at secondary sites has been observed for phages other than Sa3int phages when the primary integration site is absent or mutated (20-23). Excision of phage λ from such a site resulted in substitutions in attP (24,25), and for phage P2, the new attP region was reported to contain DNA from attR (26,27), but neither study showed increased integration in a new infection cycle. Similar to φ13, these phages encode tyrosine recombinases (11,22). This family of recombinases catalyzes recombination between substrates with limited sequence identity (28). We propose that the adaptive behavior of Sa3int phages is dependent on this promiscuity. As tyrosine-type recombinases are employed by a number of staphylococcal phages that encode virulence factors (29), our results may provide a more general explanation for how phages adapt to new bacterial strains and thereby enable the host jumps that are regularly observed for S. aureus (1).
In summary, we have shown that rapid adaptation of S. aureus prophages to alternative integration sites is mediated through nucleotide changes of the phage attP site and that excision from alternative sites leads to extensive variety in the phage pool. This facilitates phage integration in LA strains where the preferred attB site is absent. We suspect that the promiscuity of the phage-encoded tyrosine recombinase is responsible for this evolutionary mechanism and expect further research in this field to reveal this behavior also for other tyrosine recombinases.

MATERIALS AND METHODS
Strains and media. Phage-cured S. aureus 8325-4 (30) and its mutant 8325-4φ13attBmut (12) (here termed 8325-4attBmut) containing the 2-bp variation in hlb were used as recipients and indicator strains for φ13kanR. Twenty LA S. aureus strains harboring Sa3int phages were analyzed for their attR and attL composition (2). S. aureus S0385 (GenBank accession no. NC_017333) was used as a reference strain for analysis of sequencing data of the LA strains. The prophage φ13kanR carries the kanamycin resistance cassette aphA3, which replaces the virulence genes scn and chp, and was obtained by induction of 8325-4phi13kanR (12). A full strain list is provided in Table S1. Strains were grown in tryptone soy broth (TSB) (CM0876; Oxoid) and tryptone soy agar (TSA) (CM0131; Oxoid). Top agar for the overlay assays was 0.2 mL TSA/mL TSB. Kanamycin (30 µg/mL) and sheep blood agar (5%) were used to select for lysogens.
Lysogenization assay. To obtain the phage stock, 8325-4phi13kanR was grown to late exponential phase (37°C, 200 rpm; optical density at 600 nm [OD600] = 0.8), mixed with 2 µg/mL mitomycin C, and incubated for another 2 to 4 h. Phages were harvested by centrifugation for 5 min at 8,150 × g and filtering the supernatant with a 0.2-µm membrane filter. The lysogens were obtained as described previously, with slight adjustments (31). In brief, φ13kanR was added at a multiplicity of infection (MOI) of 1 to the respective recipients and incubated 30 min on ice to allow phage attachment. The nonattached phages were washed off, and after another incubation for 30 min at 37°C to allow phage infection, the culture was diluted and plated on TSA with 5% blood and 30 µg/mL kanamycin. After overnight incubation at 37°C, 20 colonies showing beta-hemolysis and two colonies without beta-hemolysis were isolated and used for further analysis. Lysogens were derived from eight independent lysogenization experiments resulting in lysogens 1 to 5 (experiment 1), 6 and 7 (experiment 2), 8 (experiment 3), 10 and 11 (experiment 4), 12 and 13 (experiment 5), 14 and 15 (experiment 6), 16 to 19 (experiment 7), and 20 to 23 (experiment 8).
Spot assay and phage propagation. Phage lysates were serially diluted in SM buffer (100 mM NaCl, 50 mM Tris [pH 7.8], 1 mM MgSO4, 4 mM CaCl2) and spotted on a recipient lawn of S. aureus 8325-4 for PFU determination. To obtain an even lawn, 100 µL of fresh culture (OD = 1) was added to 3 mL top agar and poured on a TSA plate supplemented with 10 mM CaCl2. After the top agar had solidified, three drops of 10 µL each of each dilution were spotted on the lawn.
Induction assay. To determine the different levels of phage release, the 8325-4attBmut lysogens were grown to an OD600 of 0.8 and centrifuged after addition of 2 µg/mL mitomycin C and further incubation for 2 h. The sterile-filtered supernatant was diluted and spotted on an overlay of 8325-4 consisting of 100 µL culture mixed with 3 mL top agar.
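For readers unfamiliar with spot titration, the conversion from plaque counts to PFU/mL can be sketched as follows. This is the standard back-calculation rather than a formula given in the text; the function name and the example counts are hypothetical, and the spot volume is taken to be 10 µL with three replicate spots per dilution.

```python
def pfu_per_ml(plaque_counts, dilution_exponent, spot_volume_ul=10.0):
    """Titer from replicate spots of one dilution.

    plaque_counts:     plaques counted in each replicate spot (hypothetical data)
    dilution_exponent: n for a 10^-n dilution of the lysate
    spot_volume_ul:    volume of each spot in microliters (assumed 10 uL)
    """
    if not plaque_counts:
        raise ValueError("at least one spot count is required")
    mean_plaques = sum(plaque_counts) / len(plaque_counts)
    # Scale by the dilution factor, then convert per-spot volume to per mL.
    return mean_plaques * 10 ** dilution_exponent * 1000.0 / spot_volume_ul


if __name__ == "__main__":
    # Three spots of a 10^-3 dilution with 4, 5, and 6 plaques.
    print(pfu_per_ml([4, 5, 6], dilution_exponent=3))
```

Averaging the replicate spots before scaling keeps a single miscounted spot from dominating the estimate.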
Whole-genome sequencing and bioinformatics analysis. Genomic DNA was extracted by using a DNeasy blood and tissue kit (Qiagen), and whole-genome sequences were obtained by 251-bp paired-end sequencing (MiSeq; Illumina) as described previously (32). Genomes were assembled using SPAdes (33). Geneious Prime 2020.1.1 was used to determine phage integration sites. The locations and core sequences were determined by extracting short sequences lying adjacent to the prophage from the assembled draft genomes of the lysogens and mapping them to the annotated genome of S. aureus 8325 (GenBank accession no. NC_007795). Reads obtained by sequencing the PCR amplicons spanning attP were mapped to the φ13 reference genome (GenBank accession no. NC_004617), and single nucleotide polymorphisms (SNPs) were called by applying a variant frequency threshold of 50%. WebLogo3 was applied to detect gapped motifs in the flanking regions of the alternative attB sites (34).
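The idea of calling variants against a frequency threshold can be shown with a toy sketch. It is not the Geneious Prime workflow used in the study: it assumes reads that are already aligned, gapless, and span the whole reference, and the function name and data are hypothetical.

```python
from collections import Counter


def call_variants(reference, aligned_reads, min_fraction):
    """Report non-reference bases whose frequency reaches min_fraction.

    reference:     reference sequence (e.g., the region spanning attP)
    aligned_reads: gapless reads covering the full reference (toy assumption)
    min_fraction:  frequency threshold between 0 and 1

    Returns tuples of (1-based position, reference base, variant base, fraction).
    """
    variants = []
    for pos, ref_base in enumerate(reference):
        counts = Counter(read[pos] for read in aligned_reads)
        depth = sum(counts.values())
        if depth == 0:
            continue
        for base, n in sorted(counts.items()):
            if base != ref_base and n / depth >= min_fraction:
                variants.append((pos + 1, ref_base, base, n / depth))
    return variants


if __name__ == "__main__":
    reads = ["ACGT"] * 6 + ["ACTT"] * 4  # 40% of reads carry G->T at position 3
    print(call_variants("ACGT", reads, min_fraction=0.5))   # nothing reported
    print(call_variants("ACGT", reads, min_fraction=0.01))  # minor variant reported
```

The example shows why the choice of threshold matters: a variant present in 40% of reads is invisible at a 50% cutoff but is reported once lower-frequency variation is scored.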
PCR and amplicon sequencing. Direct colony PCR was used to determine (i) the presence of the phage using sak primers, (ii) the integrity of the hlb gene using hlb primers, and (iii) attP using attPst primers (35) if the phage had spontaneously excised and was present in its circular form. Primer sequences and cycling conditions are listed in Table S2. For each reaction, a well-isolated colony was picked, suspended in 50 µL MilliQ water, heat lysed for 5 min at 99°C, and briefly centrifuged. One microliter was used as the template. To determine attP of induced phages in lysates, 1 µL of a 1:10 dilution of phage lysate was used as the template. Each single-reaction mixture was composed of 20.375 µL water, 2.5 µL Taq polymerase buffer, 1 µL each of forward and reverse primers (10 µM), 0.5 µL deoxynucleoside triphosphates (dNTPs), and 0.125 µL Taq polymerase (Thermo Fisher). PCR products were purified with a GeneJET PCR purification kit (Thermo Fisher) and sequenced either by Sanger sequencing (Mix2Seq; Eurofins Genomics) for the Sa3int phages derived from the LA-MRSA strains or by using an Illumina MiSeq system (sequencing depth varied from 10,000 to 180,000 [average, 100,000]).
qPCR assay. DNA for use in the qPCR assay (LightCycler 96; Roche) was extracted using the GenElute bacterial genomic DNA kit (Sigma). The samples of interest were obtained by lysogenizing S. aureus 8325-4 and 8325-4attBmut with the respective phage (φ13kanR, φlys2, or φlys7) and plating two 100-µL portions of the culture on TSA supplemented with 30 µg/mL kanamycin. After overnight incubation, the colonies were scraped off (approximately 10,000 colonies) and resuspended in 1 mL saline. Of this, 100 µL was used directly in the first lysis step of the kit. DNA concentration was measured using a Qubit fluorometer (Invitrogen) and diluted to 1 ng/µL, of which 5 µL was used in the qPCR, where the reaction mixture consisted of 3 µL water, 10 µL 2× FastStart Essential DNA green master, and 1 µL each of forward and reverse primers (10 µM). Primer sequences and cycling conditions can be found in Table S2.
Data availability. All genomic data used or produced in this study have been deposited at the European Nucleotide Archive (https://www.ebi.ac.uk/ena/browser/home). Accession numbers and identifiers are listed in Tables S4 and S5. Source data for the qPCR assay and Sanger amplicon sequencing can be found at https://doi.org/10.17894/ucph.d6a30dc3-54bb-430e-a90c-c4e5baefd3ca with identifiers in Table S4. Raw data can be accessed at https://www.ebi.ac.uk/ena/browser/home with identifiers listed in Table S5 and with BioProject number PRJEB44479.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only.

ACKNOWLEDGMENTS
We thank Henrike Zschach for her contribution to the bioinformatic analysis and the staff of the Danish reference laboratory for staphylococci at Statens Serum Institut for typing and handling of study isolates.
This project received funding from the European Union's Horizon 2020 research no. 765147.
H.L. and H.I. designed the study; H.L. generated experimental data, did formal analysis, wrote the manuscript, and visualized the data; R.S. supported bioinformatic analysis; J.L. provided strain material; M.S. conducted sequencing; H.L., H.I., R.S., M.S., and J.L. conducted review and editing; H.I. provided funding acquisition and project administration.
We declare no conflict of interest.