Genomic analysis of a parasite invasion: Colonization of the Americas by the blood fluke Schistosoma mansoni

Abstract Schistosoma mansoni, a snail‐borne, blood fluke that infects humans, was introduced into the Americas from Africa during the Trans‐Atlantic slave trade. As this parasite shows strong specificity to the snail intermediate host, we expected that adaptation to South American Biomphalaria spp. snails would result in population bottlenecks and strong signatures of selection. We scored 475,081 single nucleotide variants in 143 S. mansoni from the Americas (Brazil, Guadeloupe and Puerto Rico) and Africa (Cameroon, Niger, Senegal, Tanzania, and Uganda), and used these data to ask: (i) Was there a population bottleneck during colonization? (ii) Can we identify signatures of selection associated with colonization? (iii) What were the source populations for colonizing parasites? We found a 2.4‐ to 2.9‐fold reduction in diversity and much slower decay in linkage disequilibrium (LD) in parasites from East to West Africa. However, we observed similar nuclear diversity and LD in West Africa and Brazil, suggesting no strong bottlenecks and limited barriers to colonization. We identified five genome regions showing selection in the Americas, compared with three in West Africa and none in East Africa, which we speculate may reflect adaptation during colonization. Finally, we infer that unsampled populations from central African regions between Benin and Angola, with contributions from Niger, are probably the major source(s) for Brazilian S. mansoni. The absence of a bottleneck suggests that this is a rare case of a serendipitous invasion, where S. mansoni parasites were pre‐adapted to the Americas and able to establish with relative ease.


| INTRODUCTION
Genomic characterization of parasites and pathogens is increasingly being used as an aid to traditional epidemiological methods in reconstructing transmission patterns (de Oliveira et al., 2020; Nadeau et al., 2021). On a longer timescale, genomic data can be used to understand biological invasions of pathogens into new continents, just as these methods are used for investigating biological invasions in free-living organisms (Rius et al., 2015; Sherpa & Després, 2021).
Such methods can determine the colonization route, source population, number of colonization events, whether diversity is reduced during colonization and evidence for adaptation in colonizing populations. Examining the consequences of historical invasions can inform our understanding of extant invasions.
The Trans-Atlantic slave trade lasted from 1502 to 1888 (Bergad, 2007). During this time, more than 12 million people were trafficked from Africa to slave ports in the Americas, representing one of the largest forced migration events in human history (Eltis, 2001). Along with the human cargo, a number of human pathogens were introduced into the Americas. For example, parvovirus B19 (Primate erythroparvovirus 1) and Hepatitis B virus were successfully introduced into the Americas, leading to large-scale outbreaks (Guzmán-Solís et al., 2021). Today viable populations of pathogens including herpes simplex virus 2 (Human alphaherpesvirus 2; Forni et al., 2020), Yellow fever virus (Bryant et al., 2007), the parasitic nematode Wuchereria bancrofti (Small et al., 2019), among others (Steverding, 2020), are all a direct result of introductions during the Trans-Atlantic slave trade.
In some cases, the genetic signatures of the introduction are still visible. For example, genetic diversity in South American Leishmania chagasi populations is halved and the effective population size (Ne) is reduced from 43.6 million to 15.5 thousand compared to source populations in Africa (Leblois et al., 2011; Schwabl et al., 2021). Here, we focus on another successful invasion by a human-parasitic trematode, Schistosoma mansoni.
S. mansoni is distributed from Oman, through sub-Saharan Africa, to the Caribbean and countries along the eastern coast of South America. Phylogenetic evidence indicates that S. mansoni in West Africa and the Americas are closely related (Crellen et al., 2016; Desprès et al., 1993; Fletcher et al., 1981; Morgan et al., 2005; Webster et al., 2013) and these observations, along with demographic reconstructions (Crellen et al., 2016), indicate a recent origin of S. mansoni in the Americas. As a result, there is strong evidence that S. mansoni comigrated to the Americas during the forced human migrations of the Trans-Atlantic slave trade (Files, 1951). Furthermore, reduced diversity in mitochondrial haplotypes (Desprès et al., 1993; Fletcher et al., 1981; Morgan et al., 2005; Webster et al., 2013) Anderson & Enabulele, 2021). Eggs are expelled in human faeces (S. mansoni and S. japonicum) or urine (S. haematobium). Larvae (miracidia) hatch in fresh water and infect receptive snails. Inside the snail host, the schistosomes reproduce asexually, and second-stage larvae (cercariae) are released back into the water where they infect humans, mature into adult worms and restart their life cycle. S. mansoni is diploid, with a well-characterized 363-Mb genome (Berriman et al., 2009; International Helminth Genomes Consortium, 2019; Protasio et al., 2012), ZW sex determination, obligate sexual reproduction of adult worms, and a relatively long lifespan (5-10 years) (Fulford et al., 1995).
The distribution of the intermediate snail host is a major driver of schistosome distribution. S. mansoni shows strong specificity for species and even strains of snails in the genus Biomphalaria (Webster & Woolhouse, 1998). However, Biomphalaria pfeifferi, B. sudanica and B. alexandrina are the primary intermediate hosts in Africa (DeJong et al., 2001), while B. glabrata, B. tenagophila and B. straminea are the known snail hosts in South America (Vidigal et al., 2000). S. mansoni infections can impact the reproductive viability of their snail hosts, and there are strong co-evolutionary interactions driving resistance to infection in snails and infectivity in parasites (Theron et al., 2014; Webster et al., 2004). Several schistosome resistance genes have been localized within the snail genome (Tennessen et al., 2015, 2020) and polymorphic loci in snails and parasites are thought to determine compatibility (Mitta et al., 2017; Webster & Woolhouse, 1998; Woolhouse & Webster, 2000). Based on these observations, we hypothesize that adaptation to novel Biomphalaria spp. hosts would place strong selective pressures on S. mansoni as it became established in the Americas.


KEYWORDS
Africa, Brazil, codispersal, exome, human parasite, migration

Adult schistosomes live in the blood vessels, making them difficult to sample. Genome and exome sequencing of schistosomes is now possible using whole genome amplification of miracidia larvae isolated from faeces or urine (Doyle et al., 2019; Le Clec'h et al., 2018; Shortt et al., 2017), and several genome-scale population analyses have recently been published (Berger et al., 2021; Platt et al., 2019; Shortt et al., 2017). Our goal here was to address the following questions with the available sequence data from both Africa (Niger, Senegal, Uganda, Tanzania)

| Data and sample information
We examined published exomic and genomic data from 178 individual Schistosoma samples/isolates, from multiple geographical locations, available from three studies (Berriman et al., 2009;Chevalier et al., 2019;Crellen et al., 2016). Exome data are from Chevalier et al. (2016) and Chevalier et al. (2019). These data were generated from individual larval miracidia hatched from Schistosoma mansoni eggs and preserved on Flinders Technology Associates ® (FTA) cards.
Exome libraries were generated via whole genome amplification followed by targeted capture of the exome.
This method specifically targets 95% (14.81 Mb) of the exome with 2× tiled probes. The whole genome sequence data came from adult worms cultured through laboratory rodents and snails for two or more generations before whole genome library preparation and sequencing (Berriman et al., 2009; Crellen et al., 2016; International Helminth Genomes Consortium, 2019). Sample origins are shown in Figure 1. Detailed metadata are available for each sample in Table S1 including country of origin, species identification and NCBI Short Read Archive.

| Computational environment
We used conda version 4.8.3 to manage computational, virtual environments for all analyses. Sequence read filtering through genotyping steps were documented in a snakemake version 5.18.1 (Köster & Rahmann, 2012) workflow and all other analyses were performed in a series of jupyter version 1.0.0 notebooks. The code for this project, including shell scripts, snakemake workflows notebooks, and envi-
We used vcftools version 0.1.16 (Danecek et al., 2011) for additional rounds of filtering. First, we removed low-quality sites with quality score <25, read depth <12 and nonbiallelic sites. Second, we removed sites and individuals with a genotyping rate <50%. Third, we removed all sites that were on unresolved haplotigs by retaining only those SNVs that were on one of seven autosomal scaffolds (GenBank Nucleotide accessions: HE601624.2-30.2), the sex-linked ZW scaffold (HE601631.2) or the mitochondria (HE601612.2).
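The filtering rules described above can be illustrated with a small, standard-library Python sketch. It is not the vcftools implementation: the function names are hypothetical, read depth is treated as a single site-level value, and sites are filtered on genotyping rate before individuals, both of which are assumptions made only for illustration.

```python
def site_passes(qual, depth, n_alleles, min_qual=25, min_depth=12):
    """Keep biallelic sites at or above the quality and depth cut-offs."""
    return qual >= min_qual and depth >= min_depth and n_alleles == 2


def call_rate(calls):
    """Fraction of non-missing genotype calls (None marks a missing call)."""
    return sum(c is not None for c in calls) / len(calls) if calls else 0.0


def filter_by_call_rate(matrix, min_rate=0.5):
    """matrix[i][j] is the genotype of individual j at site i.

    Assumed order: drop poorly genotyped sites first, then individuals.
    Returns (kept_site_indices, kept_individual_indices).
    """
    kept_sites = [i for i, row in enumerate(matrix) if call_rate(row) >= min_rate]
    n_ind = len(matrix[0]) if matrix else 0
    kept_inds = [
        j for j in range(n_ind)
        if call_rate([matrix[i][j] for i in kept_sites]) >= min_rate
    ]
    return kept_sites, kept_inds
```

In this toy representation a genotype matrix is a list of per-site lists, which keeps the example self-contained; a real workflow would operate on the VCF directly.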

| Summary statistics
We quantified read depth per probed-exome region with mosdepth version 0.2.5 (Pedersen & Quinlan, 2018) and calculated genome-wide summary statistics for each population, including F3, FST, Tajima's D, π, and the Watterson estimator (Θ) with scikit-allel version 1.2.1 (Miles et al., 2019). We examined genome regions that were targeted by the Le  probe set for these calculations; nontarget regions (i.e., nonexomic) were ignored because most samples lacked information from these regions. FST between populations was calculated from the average Weir-Cockerham FST (Weir & Cockerham, 1984) in windows of 100 SNVs. Effective population size (Ne) was estimated from Θ and the mutation rate (μ = 8.1e-9 per base per generation; Crellen et al., 2016) with:

$$N_e = \frac{\Theta}{4\mu}$$

We examined linkage disequilibrium (LD) within each population by calculating r² (--r2) values with plink version 1.90b6.18 (Purcell et al., 2007). We excluded invariant sites from the analyses. Intra-autosomal, pairwise comparisons between SNVs within 1 Mb of one another were allowed by setting the following parameters: "--ld-window 1000000", "--ld-window-kb 1000" and "--ld-window-r2 0.0". r² values were then binned into 500-bp windows and averaged for each population using the stats.bin function in the fields version 11.6 (Nychka et al., 2017) library for R version 3.6.1. We used local regression to smooth the binned r² values with the loess function in base R version 3.6.1 and a span size of 0.5.
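As a worked illustration of how an effective population size follows from Θ, the sketch below uses the standard diploid expectation Θ = 4·Ne·μ together with the textbook Watterson estimator. The function names are hypothetical and the inputs in the usage note are made-up numbers, not values from this study.

```python
def harmonic_number(k):
    """Sum of 1/i for i = 1..k (the a1 term of the Watterson estimator)."""
    return sum(1.0 / i for i in range(1, k + 1))


def watterson_theta_per_site(n_segregating, n_chromosomes, n_sites):
    """Per-site Watterson estimator: S / (a1 * L)."""
    a1 = harmonic_number(n_chromosomes - 1)
    return n_segregating / (a1 * n_sites)


def effective_population_size(theta_per_site, mu=8.1e-9):
    """Solve theta = 4 * Ne * mu for Ne (diploid organism)."""
    return theta_per_site / (4.0 * mu)
```

For example, a hypothetical per-site Θ of 3.24e-5 with μ = 8.1e-9 corresponds to an Ne of about 1,000.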
We used a Pearson Mantel test to examine correlation between genetic and physical distance. Because we did not have exact collection coordinates for the whole genome samples, or they were laboratory-derived, we excluded them from the analyses and focused only on the S. mansoni exome samples. We calculated pairwise p-distances with vcf2dis (https://github.com/BGI-shenzhen/VCF2Dis; commit: b7684d3, accessed February 13, 2021) and physical distances between samples with the Python haversine 2.3.0 module. Finally, we used the mantel() function in the scikit-bio 0.2.1 Python library to conduct a Pearson Mantel test that included 1000 permutations.
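The logic of a permutation-based Pearson Mantel test on great-circle distances can be sketched with the standard library alone. This is an illustrative re-implementation, not the haversine or scikit-bio code used in the study; the function names, the mean Earth radius constant and the two-sided p value are assumptions.

```python
import math
import random


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0088):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(min(1.0, a)))


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant input)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(dist_a, dist_b, permutations=1000, seed=0):
    """Correlate two square distance matrices; permute rows/columns of dist_b."""
    n = len(dist_a)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    a = [dist_a[i][j] for i, j in pairs]
    observed = pearson(a, [dist_b[i][j] for i, j in pairs])
    rng = random.Random(seed)
    idx = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(idx)
        permuted = [dist_b[idx[i]][idx[j]] for i, j in pairs]
        if abs(pearson(a, permuted)) >= abs(observed):
            hits += 1
    # Two-sided p value with the usual +1 correction.
    return observed, (hits + 1) / (permutations + 1)
```

Permuting sample labels (rows and columns together) rather than individual matrix cells preserves the dependence structure of a distance matrix, which is the reason a Mantel test is used instead of an ordinary correlation test.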

| Population structure and admixture
We examined population substructure using principal components analysis (PCA) and admixture with unlinked autosomal SNVs (described above). Two PCAs were calculated in plink version 1.90b6.18, with and without the S. rodhaini samples. Population ancestry was estimated with admixture version 1.3.0 (Alexander et al., 2009).

For D3, we calculated the mean pairwise (Euclidean) distances between populations using scikit-allel's allel.pairwise_distance() function. To determine significance, we used 1000 block bootstrap replicates of 1000 SNV blocks. We calculated the average F3 across the genome in blocks of 100 variants. Here we ran multiple tests that included some combination of an African S. mansoni population (Niger, Senegal and Tanzania) as the test group and Brazilian S. mansoni and S. rodhaini as the potential source populations. D, or the ABBA-BABA statistic, was averaged over blocks of 1000 variants assuming a phylogeny of (((a, Tanzania), S. rodhaini), S. margrebowiei), where population a was either Brazil, Niger or Senegal. D, D3 and F3 values were calculated using scikit-allel.
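The block-wise ABBA-BABA calculation and a block bootstrap can be sketched as follows. This is a simplified stand-in for the scikit-allel routines, with hypothetical function names; it assumes that derived-allele frequencies are supplied for P1, P2 and P3 and that the outgroup is fixed for the ancestral allele.

```python
import random


def patterson_d(p1, p2, p3):
    """D = sum(ABBA - BABA) / sum(ABBA + BABA) from derived-allele frequencies."""
    abba = sum((1 - a) * b * c for a, b, c in zip(p1, p2, p3))
    baba = sum(a * (1 - b) * c for a, b, c in zip(p1, p2, p3))
    total = abba + baba
    return (abba - baba) / total if total else 0.0


def blockwise_d(p1, p2, p3, block_size=1000):
    """D computed separately in consecutive blocks of variants."""
    return [
        patterson_d(p1[s:s + block_size], p2[s:s + block_size], p3[s:s + block_size])
        for s in range(0, len(p1), block_size)
    ]


def bootstrap_mean(values, replicates=1000, seed=0):
    """Mean of block values with a 95% percentile interval from resampled blocks."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(
        sum(values[rng.randrange(n)] for _ in range(n)) / n
        for _ in range(replicates)
    )
    lower = means[int(0.025 * replicates)]
    upper = means[int(0.975 * replicates) - 1]
    return sum(values) / n, lower, upper
```

Resampling whole blocks rather than single variants is intended to respect linkage between neighbouring SNVs; an interval that excludes zero would point to an excess of shared derived alleles between P3 and one of the two sister populations.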

| Phylogenetics
We used three different phylogenetic methods to visualize relationships among sampled schistosomes: a mitochondrial haplotype network, a coalescent-based species tree and a phylogenetic network.

| Mitochondrial haplotype network
We extracted mitochondrial SNVs from all S. mansoni individuals with vcftools and converted the subsequent VCF file to Nexus format with vcf2phylip version 2.0 (Ortiz, 2019). A median joining network (ε = 0) was created from the mitochondrial haplotypes with popart version 1.7 (Leigh & Bryant, 2015).

| Phylogenetic network
We used a phylogenetic network to visualize and quantify migration among schistosome populations. We only included S. mansoni populations with more than four individuals, which excluded all whole genome samples from this analysis including those from the Caribbean and the S. rodhaini samples. We used autosomal SNVs after filtering linked sites in 250-kb blocks with plink version 1.90b6.18 and then used treemix version 1.12 (Pickrell & Pritchard, 2012) to generate the phylogenetic network. This analysis used a covariance matrix generated from blocks of 500 SNVs without sample-size correction ("--noss") and the number of migration events was limited to 3.

| Selection
We scanned the genome to identify regions under selection using

| sweepfinder2
This method uses deviations in allele frequency from a neutral expectation to estimate positive selection while accounting for the possibility of background selection via a likelihood ratio (LR) test (DeGiorgio et al., 2016). Empirical site-frequency spectra were calculated for each population, and within each population LR was estimated along each autosome individually. We examined grid points ("g"), or window sizes, of 1, 5, 10 and 20 kb to accommodate possible gaps caused by exome data. These options had minimal impact on the results. Downstream analyses are reported on the runs with "g" = 1 kb.

| pcadapt
We used the R version 4.0.5 package pcadapt version 4.3.3 (Luu et al., 2017) to identify highly differentiated loci among populations via variants associated with population structure as identified by PCA. We only included samples from Brazil, Niger and Senegal since our primary goal was to identify variants involved in adaptation to the Americas. Rare variants (MAF < 5%) were excluded with vcftools version 0.1.16. We identified the appropriate number of principal components from the data by running an initial pcadapt run with 20 principal components (K = 20) and LD-filtered variants (LD.clumping = list(size = 100, thr = 0.2)). The major break in the subsequent scree plot was used as the optimal K choice. We used a second pcadapt run with the optimal K and the same LD filtering parameters as the initial run to assign p values to each site. Finally, we adjusted p values for multiple tests with Bonferroni correction and an α = .05 to identify SNV outliers associated with population differentiation.
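The final multiple-testing step can be illustrated with a short Python sketch of the Bonferroni adjustment. The study itself used R; the function name is hypothetical and the p values in the usage note are invented.

```python
def bonferroni_outliers(p_values, alpha=0.05):
    """Return indices whose Bonferroni-adjusted p value is below alpha.

    The adjusted value is min(1, p * m), where m is the number of tests.
    """
    m = len(p_values)
    return [i for i, p in enumerate(p_values) if min(1.0, p * m) < alpha]
```

With four hypothetical tests and p values of 0.001, 0.02, 0.5 and 0.04, only the first remains significant after adjustment (0.004 < 0.05), which shows how conservative the correction becomes as the number of tested SNVs grows.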

| Identifying regions of selection
We identified regions potentially under positive selection using a three-step process. First, we identified SNVs whose h-scan and sweepfinder2 values were in the 99th percentile and greater than the neutral thresholds established with msprime. These were SNVs with the strongest signal of selection. Then, we expanded from the SNV to a broader region by merging all variants within 333,333 bp whose h-scan or sweepfinder2 values were greater than the neutral thresholds. Finally, we looked for pcadapt outliers in each region.
These regions are referred to as "putative selected regions" or "pu-  Figure S1.
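The first two steps of this procedure (seeding on the strongest SNVs, then extending to nearby variants above the neutral thresholds) can be sketched as below. The function names are hypothetical, the percentile uses a simple nearest-rank rule, and the extension is interpreted as chaining variants that lie within 333,333 bp of the current region edge; that interpretation is an assumption, not a statement of the exact method used.

```python
import math


def nearest_rank_percentile(values, q):
    """Nearest-rank percentile of a non-empty list (q in 0..100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def putative_regions(positions, h, lr, h_neutral, lr_neutral,
                     max_gap=333_333, q=99):
    """Return (start, end) regions that contain at least one seed SNV.

    Seed: both H and LR are in the top percentile and above neutral thresholds.
    Member: H or LR is above its neutral threshold and the variant lies within
    max_gap bp of the previous member of the region.
    """
    h_cut = max(nearest_rank_percentile(h, q), h_neutral)
    lr_cut = max(nearest_rank_percentile(lr, q), lr_neutral)
    order = sorted(range(len(positions)), key=lambda i: positions[i])
    regions, current = [], None  # current = [start, end, has_seed]
    for i in order:
        if not (h[i] > h_neutral or lr[i] > lr_neutral):
            continue
        is_seed = (h[i] >= h_cut and lr[i] >= lr_cut
                   and h[i] > h_neutral and lr[i] > lr_neutral)
        if current is not None and positions[i] - current[1] <= max_gap:
            current[1] = positions[i]
            current[2] = current[2] or is_seed
        else:
            if current is not None and current[2]:
                regions.append((current[0], current[1]))
            current = [positions[i], positions[i], is_seed]
    if current is not None and current[2]:
        regions.append((current[0], current[1]))
    return regions
```

Regions without a seed SNV are discarded, so isolated variants that only marginally exceed a neutral threshold do not produce a region on their own.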

| Summary of sequence data
After genotyping and filtering, we removed 25 of the 178 samples with low numbers of reads, poor coverage or low genotyping

Note: "n", number of samples; "π", nucleotide diversity; "H", haplotype diversity; "Θ", Watterson estimator; "Ne", effective population size.
Abbreviations: PRS, putative region of selection; Sm, Schistosoma mansoni.
a Eight of nine S. rodhaini samples came from a single laboratory population: population statistics are probably biased.

| Admixture with S. rodhaini
We asked whether hybridization with S. rodhaini, a closely related schistosome infecting rodents, might contribute to the high genetic diversity observed in East Africa vs. West Africa and South American S. mansoni. To investigate this, we used three statistics (D,  (Table 3). These results suggest that hybridization/introgression between S. mansoni and S. rodhaini may make no detectable contribution to elevated diversity in East Africa.
FIGURE 2 Linkage disequilibrium (LD) decay and diversity within populations. (a) LD between single nucleotide variants was quantified with r² values for each population. Mean r² values were taken in 500-bp windows and loess smoothed. Vertical dotted lines indicate the distance where r² = .5 for each population. LD decayed to r² = .5 in 28 bp (Tanzania), 15,150 bp (Niger), 19,318 bp (Senegal) and 26,196 bp (Brazil). (b) Nucleotide diversity (π) varied between Schistosoma mansoni populations with the highest levels of diversity occurring in East Africa (Tanzania). π was measured in 100-kb windows across the autosomal chromosomes. Outliers are not shown.
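A decay distance of this kind can be read off binned r² values as sketched below. The sketch averages r² in 500-bp distance bins and reports the midpoint of the first bin whose mean is at or below the threshold; it omits the loess smoothing step, and the function names are hypothetical.

```python
def binned_mean_r2(distances, r2_values, bin_size=500):
    """Average r2 in fixed-width distance bins; returns (bin midpoint, mean r2)."""
    sums, counts = {}, {}
    for d, r2 in zip(distances, r2_values):
        b = int(d // bin_size)
        sums[b] = sums.get(b, 0.0) + r2
        counts[b] = counts.get(b, 0) + 1
    return [(b * bin_size + bin_size / 2, sums[b] / counts[b]) for b in sorted(sums)]


def decay_distance(binned, threshold=0.5):
    """Midpoint of the first bin whose mean r2 is at or below the threshold."""
    for midpoint, mean_r2 in binned:
        if mean_r2 <= threshold:
            return midpoint
    return None
```

Because the result is limited to bin midpoints, a sub-bin estimate such as the 28 bp reported for Tanzania would require the smoothed curve rather than raw bins.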

| Population structure
We examined population structure using PCA and admixture with 38,197 unlinked autosomal SNVs. Two PCAs were generated, with and without the S. rodhaini outgroup (Figure 3). The two species were differentiated along PC1 (34.7% variance) when S. rodhaini was included (Figure 3a). S. mansoni samples cluster into geographically defined groups when S. rodhaini is excluded (Figure 3b). We used admixture to assign individuals to one of k populations, where k is between 1 and 20 (Figure 4). Cross-validation scores (Evanno et al., 2005) were minimized when k was 4 or 5. Both k = 4 and 5 split S. mansoni samples into geographically defined populations with two major differences. First, k = 4 showed that the allelic component primarily associated with Brazil was found at moderate levels in Cameroonian and Nigerien individuals. Second, k = 5 split the West African samples into a Senegalese and a Cameroonian + Nigerien population.
As observed in the PCA, the Kenyan, whole-genome sample con-

| Phylogenetics
We used three different phylogenetic methods to investigate the evolutionary relationships between populations (Figure 5).

A coalescent-based species tree from 100,819 parsimony-informative SNVs was generated with svd-quartets (Figure 5b).
Quartet sampling was limited to 100,000 quartets, which sampled 0.43% of all distinct quartets present in the alignment. The final species tree was consistent with 84.7% of all the quartets sampled.
Unlike the mitochondrial tree, samples fall into well-supported clades corresponding to geography with two exceptions. Samples from East Africa and Niger formed independent paraphyletic clades. In

| Selection
We used msprime to generate a set of neutrally evolving SNVs based on parameters specific to each of the sampled populations.
These neutrally evolving SNVs were distributed across an 88.9-Mb chromosome that was equal in size to S. mansoni chromosome 1 (HE601624.2). We then transposed the HE601624.2 exome annotation onto the simulated chromosome to extract "exome" data.
This process was repeated 342 times per population to produce a set of neutrally evolving loci to use as controls when examining selection on actual samples. We used these neutral simula-

The h-scan and sweepfinder2 results are shown in Figure 6. pcadapt results are shown in Figure S2. We defined "putative regions of selection" as those that have most likely experienced positive selection. These regions contain variants that (i) have both H and LR values in the 99th percentile, (ii) exceed the neutral thresholds and (iii) show a signal of population-specific directional selection. All SNVs meeting one or more of these criteria are listed in Table S4.
Our results recovered five, three and three putative selected regions in Brazil, Niger and Senegal respectively (Figure 6; Table S5). Information regarding the number of regions, SNVs and genes identified is presented in Tables 4 and 5. π (Figure S3) and Tajima's D (Figure S4) were depressed in these regions compared to genome-wide values.

We identified 116-157 genes within "putative selected regions" in the Brazilian, Nigerien and Senegalese populations (Table S6).

| DISCUSSION
We

| Elevated East African diversity and S. mansoni expansion across Africa
A striking result from this study is the dramatic reduction in genetic diversity between East and West Africa. Sequence summary statistics indicate that the East African population has two- to three-fold greater nucleotide diversity (π), larger Ne and greater mitochondrial diversity than the other populations (  (Berger et al., 2021; Faust et al., 2019; Gower et al., 2017; Huyse et al., 2013; Lelo et al., 2014).

The rapid decay in LD observed in East Africa compared with West African and American populations provides further evidence that East African S. mansoni populations are ancestral. Similar reductions in the rate of LD decay have been observed in humans and malaria parasites outside of their ancestral African range (Anderson et al., 2000; Gurdasani et al., 2015; Neafsey et al., 2008). We do not expect that human movement is a major barrier between East and West African schistosome populations. However, differences in snail-schistosome compatibility in East and West Africa may provide barriers to gene flow. This is seen in multiple host-parasite systems, including Daphnia-microsporidia (Ebert, 1994), and trematode infections of snails (Lively, 1989) and minnows (Ballabeni & Ward, 1993). phylogeography and compatibility relationships among East African snail species (B. sudanica, B. pfeifferi and B. choanomphala) and S. mansoni (Mutuku et al., 2017, 2021), but further research is needed to understand the compatibility of allopatric snail-schistosome combinations from East and West Africa.
We suggest that the presence of fine-scale geographical structure of Biomphalaria populations and local adaptation in sympatric Biomphalaria-schistosome combinations may limit parasite gene flow between East and West Africa.

| Does hybridization between S. rodhaini and S. mansoni contribute to elevated East African diversity
Several closely related Schistosoma species are able to hybridize and produce viable offspring, as confirmed via experimental rodent infections. The potential for hybridization between animal and human Schistosoma species is a significant public health concern (Borlase et al., 2021; Leger & Webster, 2017; Léger et al., 2020; Stothard et al., 2020). Our group, and others, have recently shown that ancient hybridization and adaptive introgression have resulted in the transfer of genes from the livestock species Schistosoma bovis into S. haematobium: West African S. haematobium genomes contain 3%-8% introgressed S. bovis sequences and S. bovis alleles have reached a high frequency in some genome regions (Platt et al., 2019; Rey, Toulza, et al., 2021). The sister species of S. mansoni, S. rodhaini, parasitizes rodents and is primarily located in eastern Africa. S. mansoni and S. rodhaini have been shown to readily hybridize and produce fertile offspring in the laboratory (Théron, 1989). Natural hybrids have been reported in Kenya and Tanzania (Steinauer, Hanelt, et al., 2008), although hybrids have only been detected from their snail intermediate host and never encountered in the mammalian hosts, humans and rodents. We were unable to find evidence of hybridization in 55 samples collected from Tanzania (Table 3).
Hybridization between these two species is thought to be rare (≤7.2%; Steinauer, Hanelt, et al., 2008). Our sample size may not be large enough to identify rare hybrids. Furthermore, we analysed exome (coding) data, which may underrepresent introgressed alleles if they are selected against. Finally, the S. rodhaini samples are primarily from a single laboratory population that may not be representative of natural populations. These caveats aside, our analyses failed to identify recent hybridization between S. mansoni and S. rodhaini. We conclude that S. rodhaini introgression makes no detectable contribution to the high genetic diversity in our Tanzanian S. mansoni samples.

| Expansion into the Americas
Previous work has shown that S. mansoni was exported from Africa to the Americas during the Trans-Atlantic slave trade (Crellen et al., 2016;Desprès et al., 1993;Files, 1951;Fletcher et al., 1981;Morgan et al., 2005;Webster et al., 2013). Here, we use genomic data to investigate the probable source population(s), number of introductions, evidence for bottlenecks and parasite adaptation during colonization.  Niger (Figure 5c), confirming a relationship between these two populations.

| Source populations
A simple hypothesis from the data is that, assuming a general east-to-west expansion holds at finer geographical scales, the source population that was eventually exported to Brazil is probably located somewhere between Benin and Angola. These This location is relatively equidistant from major slave ports in Bahia (527 km) and around Rio de Janeiro (706 km). In all, more than 3.5 million slaves were transported to Brazil (  Table S2).
The nuclear genomic data clearly suggest that the establishment of S. mansoni in Brazil was not associated with a significant population bottleneck and had minimal impacts on genome-wide levels of genetic diversity. The discrepancy between mtDNA and nuclear DNA may stem from two sources. First, mtDNA has an effective population size one-quarter that of nuclear genes (Birky et al., 1983), and can potentially provide a more sensitive indicator of bottlenecks.
Second, and perhaps more critical, mtDNA constitutes a single marker, so may poorly reflect population history (Anderson, 2001).
Extensive laboratory passage may also result in bias in population summary statistics. For example, the Caribbean samples examined have undergone 2-15 generations of laboratory passage, which is probably responsible for the elevated Tajima's D in this population (Crellen et al., 2016).

| Number of introductions
All Brazilian and Caribbean samples are paraphyletic and fall between the Cameroonian sample and West African clade in Figure 5b.
The relationships among these samples are resolved but not supported outside of a monophyletic clade containing the Brazilian samples. As a result, the autosomal phylogeny by itself does not conclusively support one or multiple introductions into the Americas.
The two Caribbean samples from Guadeloupe island contain unique mitochondrial haplotypes absent from Brazil ( Figure 5a)

| Adaptation during colonization
We hypothesized that S. mansoni introduced into the Americas would have been exposed to novel selective pressures as they adapted to new biotic and abiotic challenges. For example, the S. mansoni life cycle requires an intermediate snail host in which miracidia mature into cercariae that are capable of infecting humans. Previous work has shown the snail immune response is greater when exposed to sympatric parasite strains (Portet et al., 2019) and, in general, sympatric host/parasite combinations to be more compatible (Morand et al., 1996) (Figure 1). Instead, these parasites have adapted to using different Biomphalaria hosts, including B. glabrata, B. straminea and B. tenagophila (Hailegebriel et al., 2020;Kengne-Fokam et al., 2018;Vidigal et al., 2000). We examined exomic SNV data to identify genes and larger regions of the genome under selection at a finer scale and identified zero to five putative regions of selection from each of the major populations (Table 4).
In the Brazilian samples, we identified five putative selected regions that contain 126 genes (Table S6). π and Tajima's D were significantly reduced in these five regions compared to genome-wide averages, which is expected if these loci are, or have been, under selection (Table S3). One region is shared between the Brazilian and Senegalese populations. Forty-six genes fall within this Senegal-Brazil overlapping region, leaving 80 genes and four loci that are probably experiencing population-specific positive selection. Even within the group of 80 genes, there are nine with strong signals of selection (Table 5). These genes contained variants with H and LR values in the 99th percentile in addition to being greater than the threshold defined by neutral simulations. Several genes within this group are associated with housekeeping functions, including transcription and protein degradation (Smp_060090, 40S ribosomal protein S12; Smp_162000, UBR-type domain-containing protein; Smp_246630, UBC core domain-containing protein). Two uncharacterized proteins were identified (Smp_341570 and Smp_123590) but we are not able to speculate on their function. Of particular interest are two possible transcription factors, an uncharacterized protein containing a helix-loop-helix domain (Smp_123570, BHLH domain-containing protein) and a putative TATA-box binding protein (Smp_073680, Putative TATA-box binding protein). It is possible that adaptation to the Brazilian environment was driven by changes in gene expression, but more work is needed to understand the potential role of these loci in adaptation to the Americas.

| Selection on African S. mansoni
We examined selection on African S. mansoni as part of the process to identify unique signals of selection in the Brazilian population.
We identified 112 and 157 genes under selection in three regions each for the Nigerien and Senegalese populations (Table 4). One of the three regions, and 22 genes, was shared between Niger and Senegal. We failed to identify any regions of selection in Tanzania using our combined criteria, but results from individual tests of selection (h-scan, sweepfinder2) did overlap at nine of 25 regions (Table S8) identified in a large Ugandan population (Berger et al., 2021).
These regions were identified using a variety of within-(iHS) and between-population (F ST , XP-EHH) tests on miracidia isolated from two Ugandan locations with differing histories of praziquantel treatment. (RR374-053/5054146 and RR374-053/4785426) for the SCORE project.

CONFLICT OF INTEREST
The authors declare no competing interests.

OPEN RESEARCH BADGES
This article has earned an Open Data Badge for making publicly available the digitally-shareable data necessary to reproduce the reported results. The data is available at https://doi.org/10.5061/dryad.dv41ns209.

DATA AVAILABILITY STATEMENT
Data used in this paper were previously published (Berriman et al., 2009;Chevalier et al., 2019;Crellen et al., 2016