The genomic architecture of introgression among sibling species of bacteria

Gene transfer between bacterial species is an important mechanism for adaptation. For example, sets of genes that confer the ability to form nitrogen-fixing root nodules on host plants have frequently moved between Rhizobium species. It is not clear, though, whether such transfer is exceptional, or if frequent inter-species introgression is typical. To address this, we sequenced the genomes of 196 isolates of the Rhizobium leguminosarum species complex obtained from root nodules of white clover (Trifolium repens). Core gene phylogeny placed the isolates into five distinct genospecies that show high intra-genospecies recombination rates and remarkably different demographic histories. Most gene phylogenies were largely concordant with the genospecies, indicating that recent gene transfer between genospecies was rare. In contrast, very similar symbiosis gene sequences were found in two or more genospecies, suggesting recent horizontal transfer. The replication and conjugative transfer genes of the plasmids carrying the symbiosis genes showed a similar pattern, implying that introgression occurred by conjugative plasmid transfer. The only other regions that showed strong phylogenetic discordance with the genospecies classification were two small chromosomal clusters, one neighbouring a conjugative transfer system. Phage-related sequences were observed in the genomes, but appeared to have very limited impact on introgression. Introgression among these closely-related species has been very limited, confined to the symbiosis plasmids and a few chromosomal islands. Both introgress through conjugative transfer, but have been subject to different types of selective forces.


Background
The promiscuity of bacteria, and their ability to rapidly transfer DNA, has in recent years challenged microbiologists and geneticists seeking to integrate prokaryotes into standard models of speciation [1,2,3,4]. The dynamic nature of acquisition, loss and transfer of genes in these organisms goes beyond the recombinational process and vertical inheritance, forcing a redesign of the speciation models for prokaryotes [5,6,7].

In contrast to most eukaryotes, which have mutation and meiotic recombination as the main adaptive drivers, bacterial species rapidly adapt through other types of genetic exchange: transformation (through the cell membrane), transduction (through a vector), and conjugation (cell-to-cell contact) [8,9]. These processes can move adaptive genes between distantly related species, creating regions of high genetic similarity.

When describing prokaryotic genomes, an important distinction must be made between core and accessory genomes. The core genome is the set of ubiquitous genes within a defined group, such as a species. These genes often include housekeeping genes and are generally found in the chromosome. In certain species, core genes are also found on chromids, which are large plasmids that have acquired chromosomal characteristics [10,11]. The accessory genome is a pool of non-ubiquitous genes that can provide a bacterial strain with adaptive advantages, for instance with respect to host interaction, antibiotic resistance, or heavy metal resistance [12,13,14]. The accessory genome is mainly found in the accessory plasmids, but also in islands in the chromosome and chromids.

Genetic divergence among closely related species can arise by ecological and genetic processes. Ecologically distinct niches may select genotypes with different adaptations [15,16,17]. This model, known as the ecotype model, is frequently observed in nature.
In sympatric populations of the aquatic bacterioplankton of the family Vibrionaceae, for example, phylogenetic differentiation was observed to be initiated by a change in ecological niche [18,19].

Another possible factor for the isolation of sibling species is recombinational incompatibility [20,16]. Multiple experimental studies of bacterial recombination have revealed that homologous recombination between prokaryotes may be restricted by sequence divergence between donor and recipient [21,22], since sequence mismatches

Within-species variation
Variation within and between genospecies was investigated by characterizing nucleotide diversity, Site Frequency Spectra (SFS), Tajima's D, and decay of Linkage Disequilibrium (LD) with genomic distance (Fig. 3, see Methods).

The average nucleotide diversity differs by a factor of 5 among genospecies, and is higher for accessory than core genes and slightly higher for genes located on chromids compared to the chromosome (Fig. 3a). This is consistent with stronger purifying selection acting on essential genes.

The site frequency spectra are shown separately for synonymous and non-synonymous sites for genospecies A, B and C (Fig. 3b). Overall, the peaks of intermediate frequency SNPs reflect the population structure within each genospecies. For synonymous SNPs, the shape of the SFS differs among genospecies with genospecies C having a larger proportion of rare variants and genospecies A hav- (Fig. 3c). Contrasting synonymous and non-synonymous SFS for each genospecies we find a relative excess of rare non-synonymous variants consistent with segregation of non-synonymous variation under weak purifying selection.

bioRxiv preprint first posted online Jan. 24, 2019; doi: http://dx.doi.org/10.1101/526707. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

We assessed the decay in intragenic linkage disequilibrium with distance using

From all 196 genomes, 24 distinct RepA sequence groups were identified. However, four of these correspond to isolated repA-like genes that are not part of repABC operons, and twelve others are rare (in no more than four genomes), so eight types account for nearly all the plasmids (Fig. 4a). We numbered them Rh01 to Rh08 in order of decreasing frequency in the set of genomes. Of these, Rh01 and Rh02, corresponding to the two chromids pRL12 and pRL11 of the reference strain 3841 [10], are present in every genome. The distribution of the other plasmids shows some dependence on genospecies, but none is confined to a single genospecies. For example, Rh03 is present in all strains of gsA, gsB and gsC, but absent from gsE and present in just one gsD strain, while Rh05 is universal in gsA and gsB but absent elsewhere. The phylogeny of repA genes within individual plasmid groups sheds light on their history of transfer between and within genospecies. In groups Rh01 to Rh05, each clade in the phylogeny contains strains of a single genospecies, providing no evidence for recent transfer of these plasmids between genospecies.

of incomplete genome assembly, but the overall picture is clear. Genospecies A symbiosis plasmids are all Rh06, in gsB they are Rh07, gsC has mostly Rh04 but some Rh07 and Rh08, gsD has Rh08, gsE has mostly Rh08 but some Rh06 and Rh07.

There are striking differences in the apparent mobility of these plasmids. Conjugal transfer genes (tra and trb) are present in some Rh04 plasmids and all Rh07 and Rh08 plasmids, including those that are symbiosis plasmids. These genes are all located together immediately upstream of the repABC replication and partitioning operon, in the same arrangement as in the plasmid p42a of R. etli CFN42, which has been classified as a Class I, Group I conjugation system [40].

(Fig. 5a). High intergenic correlations were restricted to genes within each compartment; few inter-compartment interactions were observed. Interestingly, we found that the symbiosis plasmids maintained high levels of intergenic LD, suggesting that this plasmid has been recently acquired (Fig. 5b).

Intergenic LD between all pairs of symbiosis genes showed clear blocks of linkage disequilibrium similar to those that have been previously described [41] (Fig. 5c).

The small LD blocks within the symbiosis cluster agree with functionality: nod genes

Interestingly, the majority of these strains originated from organic fields.

To determine whether genomic introgression among these sibling species was restricted to the sym plasmid, we analysed the evolutionary history of individual genes. We calculated the discordance between the gene trees and the genospecies classification (discordance score, Additional file 1: Fig. S15; Methods).

If a gene tree resembles the genospecies topology of the species tree, where distinct clades of genospecies are observed, then the gene would have a zero discordance score. The results showed that around 20% of the genes have no evidence for transfer between genospecies (discordance equal to zero), 35% have a discordance score of 1, and 16% have a discordance score of 2, indicating that the majority of the genes closely follow the species phylogeny. Symbiosis genes are in the tail of this distribution with a discordance score above 6 (Fig. 7a), in accordance with our expectations based on our observation of sym-plasmid introgression.

Population genetic parameters were contrasted between symbiosis genes and other classes of gene (Table 1, Additional file 2: Table S6). The results show that the level of polymorphism overall is similar for symbiosis genes and other genes but that the diversity is distributed differently. In symbiosis genes, identical or near-identical haplotypes are more often observed even across several genospecies (Fig. 6). However, several distinct groups of haplotypes exist, yielding a very high Tajima's D for symbiosis genes (Additional file 2: Table S7). This suggests either selective sweeps within these groups, some form of balancing selection among groups, or a combination of both.
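To illustrate why several distinct haplotype groups at intermediate frequency push Tajima's D upward, the following minimal Python sketch computes D from a set of aligned sequences using the standard formula of Tajima (1989). It is not the pipeline used in this study: the function name `tajimas_d` and the toy alignments are hypothetical, and gaps or ambiguous bases are not handled.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences (illustrative sketch)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites
    S = sum(1 for i in range(length) if len({s[i] for s in seqs}) > 1)
    if S == 0:
        return float("nan")  # D is undefined without variation

    # pi: mean number of pairwise differences
    n_pairs = n * (n - 1) / 2
    pi = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in combinations(seqs, 2)
    ) / n_pairs

    # Constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    return (pi - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))


# Toy data: two divergent haplotype groups versus a sample of singletons
two_groups = ["AAAAAAAA"] * 5 + ["TTTTTTTT"] * 5
singletons = ["A" * 8] * 2 + ["A" * i + "T" + "A" * (7 - i) for i in range(8)]
print(tajimas_d(two_groups), tajimas_d(singletons))
```

In this toy setting the two-group sample yields a positive D, because intermediate-frequency variants inflate pairwise diversity relative to the number of segregating sites, whereas the singleton-rich sample yields a negative D.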

By plotting discordance scores against gene locations based on a PacBio reference genome (SM3), we observed that highly introgressed genes are concentrated in the smaller plasmids (Fig. 7b). This reflects the most frequent mode of exchange of the symbiosis plasmids, where entire sym-plasmids are transferred through conjugation

Chromosomal introgression is restricted to few events
We identified two specific chromosomal regions where introgression events predominantly occur. Cluster 1 (Fig. 8b and c, Additional file 2: Tables S8 and S9) was consistently found in the same region in 87 strains (64 gsC, 23 gsB) downstream of a core phasin gene. The cluster comprises two regions of accessory genes with higher than average discordance scores flanking a region of core genes that probably travels with them and also has elevated discordance (Fig. 8b, Fig. S16).

Cluster 1 encodes a type IV secretion system (T4SS) in many strains, and this T4SS bears a striking resemblance to one of the three T4SSs of Agrobacterium tumefaciens C58. Two of these systems, Trb and AvhB, mediate conjugal transfer of Ti and pAtC58 plasmids, respectively, between Agrobacterium cells, whereas the

ligase, a metallophosphatase superfamily gene, and a high number of hypothetical proteins (Additional file 2: Table S9). In 104 strains there was no insert at the start of cluster 1. All strains have a discordant cluster of polysaccharide metabolism genes, which seems to travel with the chromosomal island, but these genes are distinctive in strains without the initial insert, such as SM4 and SM100 (Fig. 8c).

Cluster 2 (Fig. 8d

We have also evaluated population genetic parameters of highly discordant chromosomal genes (Additional file 2: Table S8). In contrast to the symbiosis genes,

Table S10). The most abundant phage protein identified was a putative portal protein homologous to that in Brucella phage Pr (gi418487847), which is an essential component of stable DNA encapsidation [54].

Phylogenetic analysis shows that individual homologous phage proteins have the tendency to cluster by genospecies; however, due to high conservation of protein sequences, different genospecies are found in the same clades. We therefore speculate that phages have the ability to transduce between genospecies, but are more often

Only the two strains from gsA (SM154C and SM163B) showed potential evidence for recent phage introgression near the cluster, with four orthologous phage proteins located exactly the same base pair distance from the cluster start site in both strains.

Five related but distinct genospecies can be found in sympatry
We have assembled the genomes of 196 Rhizobium leguminosarum strains, which were isolated from root nodules of white clover (Trifolium repens) in three different

observed clear patterns of genomic clustering into five genospecies as previously reported [38] (Figure 1a). The average nucleotide identity of conserved core genes and the number of shared orthologous genes (Fig. 1b and c) also reflected the five distinct genospecies. Multiple genospecies were observed at the same field site, as previously reported [38]. The distinct genospecies thus coexist in sympatry, but remain genetically well separated.

The core genomes of the genospecies are completely diverged
Although sympatry is observed, analysis of individual gene trees showed that horizontal gene transfer has been mainly confined to symbiosis plasmids and two chro-

The genospecies studied here displayed a diverse set of plasmid profiles (Fig. 4a), as has been previously described in these and other Rhizobium species [10, 67, 68].

The distribution of these plasmids shows some dependence on genospecies, but no plasmid type is confined to a single species, and plasmids therefore seem to have been transferred among genospecies. Symbiosis plasmids can belong to any of a number of plasmid types (Rh04, Rh06, Rh07 and Rh08), and phylogenetic evidence indicated that some of them have been transferred through conjugation between different genospecies (Fig. 4b). These transfers are likely recent, since the sequences have not yet diverged at all. Because conjugation requires cell-to-cell contact, it is evident that plasmid transfer is not just constrained by genetic similarity [69,33], but also by the requirement that donor and recipient are found in the same location, again underlining the sympatric nature of these sibling species.

Chromosomal introgression events were detected based on phylogenetic discordance
Evidence for sym-plasmid transfer between genospecies was also observed when

for regions of the genome that significantly deviate from the genomic average [76]. These approaches rely on the uniformity of the host signature and on a relatively distant origin of the exogenous sequences [73]. For many HGT events these as-

large number of genomes, combined with a well-defined species tree and carefully pruned orthologous gene groups, gave us enough power to confidently find genes strongly deviating from the species phylogeny.

Based on our phylogenetic method, we identified two events of chromosomal introgression where clusters of genes were transferred between genospecies. Cluster

The avhB gene cassette and the traG gene in cluster 1 also show similar organisation to a conjugative transfer system encoded by the virB/traG of the plasmid pSymA of S. meliloti [80,81] and to the virB/virD4 of Bartonella tribocorum [82].

, Bacillus subtilis: [85], V. cholerae: [86]).

In cluster 2 we found toxin-antitoxin (TA) genes located within the cluster, but we could not determine a putative transfer mechanism. The maintenance of integrative conjugative elements (ICE) is in many cases mediated by the presence of functional toxin-antitoxins [87,88,89]. The loss of these TA genes causes a post-segregational killing of the bacterial cell by the toxin's destructive effect [88]. Chromosomally-encoded TA systems have been shown to protect against large-scale deletion of genomic islands [90], but have also been reported to have different functions in the host [88].

proximally located positively selected genes. This could be the reason that we see striking discordance peaks in the two chromosomal islands (Fig. 7b). MGEs can also be viewed as elements with independent evolutionary trajectories to their host. The presence of a toxin-antitoxin system placed close to the second cluster shows one of the possible strategies that these elements deploy to increase their own fitness and vertical propagation.

Our results indicate that conjugation is the predominant mechanism of introgression among the five genospecies, but we also investigated the effect of phage-

Fig. S11-S12). Although we observed high linkage disequilibrium within sym-clusters, symbiosis genes did not appear to be linked to the chromosomal islands (Fig. 6).

We found significantly positive values of Tajima's D for the symbiosis genes, which indicates the presence of several distinct groups of haplotypes. This distinguished the symbiosis genes not just from the core genome, but also from most of the accessory gene set (Fig. 7, Table 1). Evidence for similar balancing selection of

Table S1).
possible until the scaffold became circular or no further extension was justified, and unique contigs that remained unconnected to chromosomal or plasmid scaffolds were extended. Finally, scaffolds were connected if their ends had appropriately spaced matches in the reference genomes. Scaffold sequences were assembled using overlap sequences to splice adjacent contigs exactly, or inserting an arbitrary spacer of
by analyzing the synteny of homologous genes surrounded by a 40-gene neighbourhood (see Synteny section). After this filtering step, the orthologous gene groups were aligned using ClustalO ([105], v. 1.2.0). Each gene sequence was translated to its corresponding amino acid sequence before alignment and back-translated to the original nucleotides. Each gap was replaced by 3 gaps, resulting in a codon-aware nucleotide alignment. Highly diverse genes (nucleotide diversity > 0.2) were checked manually. We observed that many of these were fragmented or partial genes, belonged to wrongly assigned orthologous groups, were present in few taxa, and were enriched for "hypothetical proteins" annotation. Therefore, for the population genetic analysis we filtered out these possibly problematic genes with an ANI cutoff equal to 0.65.
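The back-translation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's own script: the function name `back_translate` is hypothetical, the aligned protein is assumed to use "-" for gaps, and the unaligned nucleotide sequence is assumed to contain exactly one codon per residue (no trailing stop codon).

```python
def back_translate(aligned_protein, nucleotides, gap="-"):
    """Thread the original codons onto an aligned protein sequence.

    Each residue consumes the next codon from `nucleotides`; each protein
    gap becomes three nucleotide gaps, giving a codon-aware alignment.
    """
    codons = []
    pos = 0
    for residue in aligned_protein:
        if residue == gap:
            codons.append(gap * 3)
        else:
            codon = nucleotides[pos:pos + 3]
            if len(codon) != 3:
                raise ValueError("nucleotide sequence is shorter than the protein")
            codons.append(codon)
            pos += 3
    if pos != len(nucleotides):
        raise ValueError("nucleotide sequence is longer than the protein")
    return "".join(codons)


# Toy example: Met, gap, Lys
print(back_translate("M-K", "ATGAAA"))
```

Because the original codons are reused rather than regenerated from the amino acids, synonymous differences between strains are preserved in the nucleotide alignment.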
criteria we ended up with 6,529 genes and 441,287 SNPs. Scripts and pipelines are available at a GitHub repository [107].

Since all sequenced strains were isolated from white clover nodules, they are expected to carry the canonical symbiosis genes. One strain, SM168B, carried no symbiosis genes. Subsequent nodulation tests showed that the strain could colonize white clover and produce pink nodules, suggesting that the genes were lost during the pre-sequencing processing. On the other hand, strains SM165B and SM95 were found to have duplicated symbiosis regions.

Average nucleotide identity of core genes
In order to place 196 strains into the previously described genospecies [38], a phylogenetic tree was first constructed based on a single gene (rpoB) (Additional file

Table S4).

analysis was done within each population; therefore, we did not use the corrected genotype matrices.

the 'decorrelation' of genotype matrix X was done by multiplying X by the inverse of the square root of V as follows:

$T = V^{-1/2} X$

T is therefore the pseudo SNP matrix, which is corrected for population structure.

The correlation between gene matrices was obtained by applying a Mantel test to

The standardized Mantel test is actually the Pearson correlation between the elements of genes X and Y.

and before visiting its right child, searching deeper in the tree whenever possible.

When the leaf of the tree was reached, the strain number and its genospecies origin were extracted. A list containing the genospecies was stored for the entire tree. The discordance score was computed as follows:

The discordance score evaluates the number of times a shift (from one genospecies to another) is observed in a branch. The minimum possible number of shifts is the total number of genospecies minus one. A tree congruent with the species tree must have a discordance score equal to zero (Additional file 1: Fig. S15).
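The traversal and scoring described above can be sketched in a few lines of Python. Since the formula itself is missing from this copy of the text, the sketch assumes that the score is the number of genospecies shifts along the depth-first leaf order minus the minimum possible number of shifts (number of genospecies present minus one), which is what the zero score for a congruent tree implies. The nested-tuple tree encoding and the names `leaf_genospecies` and `discordance_score` are hypothetical.

```python
def leaf_genospecies(tree, genospecies_of):
    """Depth-first traversal (left child before right child).

    `tree` is either a strain name (leaf) or a tuple of subtrees.
    Returns the genospecies of each leaf in traversal order.
    """
    if isinstance(tree, str):
        return [genospecies_of[tree]]
    labels = []
    for child in tree:
        labels.extend(leaf_genospecies(child, genospecies_of))
    return labels


def discordance_score(tree, genospecies_of):
    """Assumed score: observed shifts minus the minimum possible shifts."""
    labels = leaf_genospecies(tree, genospecies_of)
    # A shift is a change of genospecies between consecutive leaves
    shifts = sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    minimum_shifts = len(set(labels)) - 1
    return shifts - minimum_shifts


# Toy strains and genospecies assignments (hypothetical)
genospecies_of = {"s1": "gsA", "s2": "gsA", "s3": "gsB", "s4": "gsB"}
congruent = (("s1", "s2"), ("s3", "s4"))   # leaf order: A A B B
discordant = (("s1", "s3"), ("s2", "s4"))  # leaf order: A B A B
print(discordance_score(congruent, genospecies_of))
print(discordance_score(discordant, genospecies_of))
```

Note that a score based on leaf order depends on how children are ordered at each node, so a real implementation would need a consistent rule for ordering subtrees before traversal.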

Competing interests
The authors declare that they have no competing interests.

Figure 5 Different intensities of LD between compartments and evidence of HGT. (a) Intergenic LD was calculated for each genomic compartment of strain SM3 (578, 468, 249, 228, 133 genes are present in plasmids Rh01, Rh02, Rh03, Rh05 and Rh07 respectively). The mean intergenic r² is: Rh01=0.11; Rh02=0.15; Rh03=0.11; Rh05=0.14; Rh07=0.15

Figure S1-2. Map of soil sampling locations; Figure S3. PacBio assembly stats; Figure S4. Spades and Jigome assembly; Figure S5. Overall assembly stats; Figure S6. Phylogenetic tree based on rpoB; Figure S7. Pan genome analysis; Figure S8. Population genetics stats; Figure S9. Structural rearrangements between genospecies; Figure S10. repA phylogeny of plasmid Rh07; Figure S11. Phylogenies of tra genes of plasmid Rh08; Figure S12-13. Population structure effects on LD estimates; Figure S14. Species tree; Figure S15. Discordance score scheme; Figure S16. Chromosomal introgression islands; Figure S17. Introgression mediated by phage; Figure S18. Discordance score distribution across genomic compartments.

Additional file 2 - Excel spreadsheet with multiple data
This file is a multi-page table composed of the following information:
Figure 8 (panel labels: Membrane-associated solute transport; AvhB conjugation system)