Cellular barcoding of protozoan pathogens reveals the within-host population dynamics of Toxoplasma gondii host colonization

Summary Cellular barcoding techniques are powerful tools to understand microbial pathogenesis. However, barcoding strategies have not been broadly applied to protozoan parasites, which have unique genomic structures and virulence strategies compared with viral and bacterial pathogens. Here, we present a CRISPR-based method to barcode protozoa, which we successfully apply to Toxoplasma gondii and Trypanosoma brucei. Using libraries of barcoded T. gondii, we evaluate shifts in the population structure from acute to chronic infection of mice. Contrary to expectation, most barcodes were present in the brain one month post-intraperitoneal infection in both inbred CBA/J and outbred Swiss mice. Although parasite cyst number and barcode diversity declined over time, barcodes representing a minor fraction of the inoculum could become a dominant population in the brain by three months post-infection. These data establish a cellular barcoding approach for protozoa and evidence that the blood-brain barrier is not a major bottleneck to colonization by T. gondii.


MOTIVATION
Toxoplasma gondii's chronic infection of the host is associated with brain colonization and parasite life cycle stage differentiation, but how this affects T. gondii's within-host population dynamics is unclear. Cellular barcoding has been used to provide these insights for viruses and bacteria, but these methods have not been widely adapted to eukaryotic parasites such as T. gondii. Such knowledge can reveal selection bottlenecks, advancing our understanding of how this aspect of the within-host-pathogen interaction influences pathogenesis.

INTRODUCTION
Microbial colonization of tissues within the host organism is a key feature of host-pathogen interactions and is often responsible for the pathology of the associated disease (Ribet and Cossart, 2015). For example, colonization of the brain by eukaryotic protozoan parasites such as Toxoplasma gondii, Trypanosoma brucei, and Plasmodium spp. can lead to encephalitis in immune compromised or severely infected hosts (Luft et al., 1993). In the case of T. gondii infection, the acute phase of infection, whether initiated via the natural, per oral route or intraperitoneal (i.p.) inoculation, is characterized by rapid tachyzoite replication and dissemination to nearly every tissue in the host. The chronic phase of infection is accompanied by systemic immune clearance and parasite conversion to slower growing, encysted bradyzoites that persist in skeletal muscle and the brain (Wohlfert et al., 2017). The "immune privileged" quality of the chronic infection tissue niches such as the brain coupled with the protective properties of the T. gondii cyst wall are essential for transmission to the next host via predation. This is a unique aspect of T. gondii's life cycle compared with related protozoa (Dubey, 1997; Wohlfert et al., 2017). For example, colonization of the brain niche by the distantly related protozoan T. brucei brucei contributes to trypanosomiasis pathology (Rodgers, 2010). However, for T. brucei, this is likely a dead-end niche, as evidenced by the severe, rapidly lethal pathology and no means of transmission to the tsetse fly or mosquito vector once in the CNS.
For T. gondii infection, it follows that intraspecific competition prior to and during brain colonization is expected to have an impact on the long-term success of a clone. Despite advances in our molecular understanding of tachyzoite-to-bradyzoite differentiation (Barrett et al., 2019; Waldman et al., 2020), our comprehension of T. gondii's within-host population dynamics remains limited. One critical function of tissue barriers, including the blood-brain barrier (BBB) and the intestinal epithelial barrier, is to physically restrict pathogen access (Elsheikha and Khan, 2010; Kim, 2008). Notably, stringent bottlenecks are experienced by the related parasite Plasmodium spp. during transmission to and from the mosquito host, with these points of population restriction key targets for transmission-blocking therapeutic intervention strategies (Graumans et al., 2020; Sinden, 2017). It is anticipated that the BBB imposes a selection bottleneck upon the T. gondii population as the chronic infection is established. However, we lack tools to determine if the brain is colonized infrequently by a small number of parasites that subsequently expand within the tissue niche or if the host brain is broadly permissive to colonization by T. gondii.
DNA-based cellular barcoding has been instrumental for understanding how the host and the infection influence the genetic complexity of pathogen populations and their colonization dynamics (Blundell and Levy, 2014; Kebschull and Zador, 2018). Early studies using restriction site-tagged poliovirus identified a bottleneck limiting the genetic diversity of viral quasi-species transmitted to the murine brain (Pfeiffer and Kirkegaard, 2006). Furthermore, studies using wild-type isogenic tagged strains (WITS) of Salmonella have provided insights into selection bottlenecks experienced during gastrointestinal and invasive bacterial colonization (Grant et al., 2008; Lim et al., 2014). The population structure of Salmonella infection was found to experience dramatic selective bottlenecks during the colonization of the gut niche, with the predominant colonizing strain also the dominant strain transmitted by super shedders (Lam and Monack, 2014). The generation of WITS requires the insertion of DNA barcodes into the genome of the infectious agent (Grant et al., 2008). Distinct from applications in which barcodes are used to track mutant populations, these cellular barcodes allow the complexity of the population to be closely monitored and mapped over the course of an infection using quantitative next-generation sequencing (NGS) approaches. Unexpectedly, cellular barcoding has not been widely used to study the within-host population structure of wild-type protozoan pathogens.

RESULTS
Protozoan pathogens can be cellularly barcoded with a simple CRISPR-based strategy
Inspired by WITS (Grant et al., 2008), we sought to establish a versatile system to cellularly barcode eukaryotic pathogens. We considered that our approach should (1) modify cells to harbor single barcodes at an identical position within a non-essential genomic locus and (2) include a selection strategy to ensure that a successfully barcoded cell could be enriched to represent the entire population. This would ensure that the barcoded population was isogenic apart from the unique barcode identifier. Established CRISPR-Cas9 tools for T. gondii and T. brucei (Rico et al., 2018; Shen et al., 2014; Sidik et al., 2014) were adapted to mediate site-specific recombination of DNA barcodes into non-essential genes that encode negative selectable markers in T. gondii tachyzoites or T. brucei trypomastigotes (Figure 1A). Notably, T. brucei naturally lacks non-homologous end-joining (NHEJ) machinery, and T. gondii parasites deficient for NHEJ (RHΔku80) were used to increase recombination efficiency (Huynh and Carruthers, 2009). Parasites were co-transfected with a plasmid encoding both the Cas9 nuclease and guide RNA (gRNA) scaffold (Shen et al., 2014), and a unique 60 nt single-stranded donor template encoding the barcode. Successful targeting of Cas9 and genomic integration of the barcode disrupted the UPRT coding sequence, conferring resistance to the prodrug 5-fluorodeoxyuridine (FUDR) (Donald and Roos, 1995). To prevent further modification of the UPRT locus following a single barcode integration event, our strategy also deleted both the protospacer DNA sequence recognized by the CRISPR gRNA and the protospacer adjacent motif (PAM). We confirmed barcode integration and the expected genomic modification by Sanger sequencing of the UPRT locus in the drug-resistant parasite population (Figure 1B). A similar cellular barcoding strategy was applied to the bloodstream form T. brucei whereby Cas9 was targeted to the haploid AAT6 locus (Rico et al., 2018), a single-copy non-essential gene that confers sensitivity to eflornithine (Vincent et al., 2010) (Figure 1A). Parasites stably expressing Cas9 were co-transfected with AAT6-targeting sgDNAs and a double-stranded barcoding oligo. Successful barcode integration into the AAT6 locus conferred resistance to eflornithine. Sanger sequencing confirmed the expected genomic rearrangement and barcoding of the T. brucei genome (Figure 1B). These data indicate that our barcoding strategy is an efficient tool to insert a unique barcode sequence at specific genomic positions in two divergent eukaryotic pathogens.

Barcode alleles can be identified and quantified in complex populations
To examine the stability of individual barcodes across a population of uniquely labeled parasites and to evaluate the sensitivity of barcode detection, we focused on T. gondii. First, we generated a plate-mapped library of 96 uniquely barcoded parasite lines using multiplexed transfection (Figure 1C). We reasoned that this multiplexed strategy would be useful for future chemical or genetic screening applications applied to each of the 96 uniquely barcoded strains prior to pooling. To quantify the relative representation of individual barcodes after pooling, an NGS pipeline was developed. DNA was extracted from the pooled parasite library, the barcoded region of the UPRT locus was amplified for Illumina sequencing, and barcode representation was quantified using Galaxy (Figures 1C and S1A) (Afgan et al., 2018; Smith et al., 2009). Our pipeline successfully identified all 96 uniquely barcoded strains from the pooled population (Figure 1D). Notably, wells from one row of the 96-well plate were consistently more highly represented in the pooled population, reflecting a technical challenge of the 96-well transfection method (Figure 1D). Although unplanned, this observation suggested that the NGS readout was highly sensitive to differences in barcode frequency in the pooled population. This was confirmed experimentally, as greater variation in barcode representation was observed between biological replicate transfections (Figures S1B and S1C) than between technical replicates of DNA indexed for NGS (Figures S1B, S1D, and S1E), underscoring the reproducibility of the NGS pipeline. We anticipate that the sources of variability could include parasite number and stochastic DNA extraction variability. We next tested if we could identify groups of barcodes present at different frequencies within a single complex population. We generated a 2-fold dilution series of the pooled parasite population prior to isolating DNA for NGS. A positive correlation was observed between the number of reads (read output) and the number of parasites in the input (Figure 1E; r = 0.9954). This confirmed that barcodes could be successfully identified and reliably quantified within libraries of at least 96-member complexity.
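The quantification step of the pipeline can be illustrated with a minimal sketch. It is not the Galaxy workflow used in the study: it simply counts exact matches to a whitelist of known barcodes in amplicon reads and converts the counts to relative frequencies. The barcode names, the 8-nt barcode sequences, the flanking sequence, and the function names are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def count_barcodes(reads, barcodes):
    """Count reads containing each known barcode (exact match only).

    reads    -- iterable of read sequences (strings)
    barcodes -- dict mapping barcode name -> barcode sequence
    Each read is assigned to at most one barcode.
    """
    counts = Counter({name: 0 for name in barcodes})
    for read in reads:
        for name, seq in barcodes.items():
            if seq in read:
                counts[name] += 1
                break  # one integrated barcode per parasite genome
    return counts


def relative_frequencies(counts):
    """Convert raw counts to fractions of all barcode-assigned reads."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: n / total for name, n in counts.items()}


# Hypothetical barcodes and toy reads (not the sequences used in the study).
barcodes = {"bc01": "ACGTACGT", "bc02": "TTGACCAA", "bc03": "GGCATCGA"}
reads = [
    "CCAAACGTACGTGGTT",
    "CCAATTGACCAAGGTT",
    "CCAAACGTACGTGGTT",
    "CCAANNNNNNNNGGTT",  # unassignable read, ignored
]

counts = count_barcodes(reads, barcodes)
freqs = relative_frequencies(counts)
for name in barcodes:
    print(name, counts[name], round(freqs[name], 3))
```

A barcode with zero reads, such as the hypothetical bc03 here, is what would be scored as absent from a sample. A production pipeline would also need to handle sequencing errors and read orientation, which this sketch deliberately ignores.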

Multiplex barcoded parasite libraries are stably maintained in vitro and in vivo
To test whether the complexity of the barcode libraries was stably maintained in vitro, the pooled library of barcoded parasites was serially passaged through human foreskin fibroblasts (HFFs). Lysed-out parasite cultures were sampled every passage (~36 h) for a period of six passages, equivalent to six lytic growth cycles (invasion, replication, egress). Relative to the input, the genetic complexity of the barcode population in vitro was remarkably stable after one lytic cycle or six passages (Figures 1F and 1G). We next tested our ability to propagate and recover the barcode library in vivo within a murine host using a pooled library of 63 barcodes (in this pilot experiment, some wells were not efficiently transfected). An inoculum of 2 × 10^6 parasites was injected intraperitoneally into C57BL/6 mice. After 36 h, parasites were isolated from the peritoneal cavity by lavage and compared with the input population. A strong positive correlation was observed between the input and peritoneal exudate populations (Figure 1H; Pearson correlation coefficient [PCC] r = 0.98), demonstrating that the genetic complexity of the multiplex barcode library was stable over the first 36 h of in vivo infection.
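The Pearson correlation coefficient used to compare barcode frequencies between two populations can be sketched as follows. The frequency values are invented for illustration and the function name is an assumption; the study does not state which software was used to compute the PCC.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical relative barcode frequencies, listed in the same barcode order.
input_freq = [0.40, 0.30, 0.20, 0.10]
exudate_freq = [0.38, 0.33, 0.18, 0.11]

r = pearson_r(input_freq, exudate_freq)
print(f"PCC r = {r:.3f}")
```

Both lists must follow the same barcode order, and barcodes absent from one sample should be entered as zero rather than dropped, otherwise the comparison is biased toward agreement.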
Libraries of barcode alleles can be generated by a "one-pot" transfection method
One challenge of the multiplex platform arose from experiment-to-experiment variability in transfection efficiency across individual wells (Figures 1D and 1H). Our CRISPR-Cas9 strategy was designed to ensure the integration of a single barcode into the UPRT locus by simultaneously deleting the protospacer and PAM motifs recognized by the CRISPR gRNA. We hypothesized that a barcoded library of parasites could be generated by a single transfection with the Cas9-gRNA plasmid and a pooled library of oligonucleotide repair templates using a widely available cuvette-based transfection apparatus (Figure 2A). This "one-pot" method was tested on type I RHΔku80 and type II PruΔku80 parasite strains in parallel, using the same mixed pool of 96 barcode oligo repair templates. FUDR-resistant parasite populations were enriched, genomic DNA isolated, UPRT locus amplicons prepared, and NGS libraries sequenced. All 96 barcodes were identified in the sequence reads generated from the UPRT locus amplicon (Figure 2B). When the distribution of barcodes in RHΔku80 and PruΔku80 were compared directly, the two independently transfected parasite strains exhibited correlated frequency distributions for all but the least abundant barcodes, with a Pearson correlation coefficient of 0.7 (Figure 2B). We confirmed the technical reproducibility of NGS runs by comparing technical replicates of indexed DNA amplicons (Figure S2A), consistent with previous reports (Robinson et al., 2014). We concluded that differences in barcode frequency most likely reflect minor variations in the relative abundance of each barcode oligonucleotide template within the pool used for transfection, rather than a negative impact of some barcodes on parasite fitness. Supporting this notion, the population was remarkably stable after 28 days of growth in vitro (Figure S2B), indicating that there were no subtle fitness defects associated with the integration of specific barcodes, with low-frequency barcodes stably maintained. Stochastic variation was therefore the likely driver of the limited variation observed. It is notable that a higher level of biological reproducibility was observed for one-pot library transfections than the original multiplexed transfection approach (Figure S1C). We next tested the one-pot strategy in T. brucei. We were similarly able to produce complex 96-member libraries of uniquely barcoded strains from a single transfection (Figure 2C). The one-pot transfection protocol was used for all subsequent experiments.

Figure 1. (E) Individual barcodes can be detected at low frequencies, and read depth is a sensitive proxy for parasite number. Scatterplot presents the number of reads for a serial 2-fold dilution series of known numbers of parasites and the relative frequency of individual barcodes in the population. The shaded area indicates the data used to calculate the PCC provided: r = 0.9954, n = 9, p < 0.0001 (two-tailed).

Cellular barcodes reveal the population structure of a T. gondii infection in vivo
The transition from acute to chronic infection corresponds with the spatial redistribution of T. gondii to skeletal muscle and the CNS (Wohlfert et al., 2017). This is accompanied by tachyzoite differentiation into bradyzoite cysts, which are necessary for parasite transmission (Barrett et al., 2019). The restrictive nature of the BBB is well documented in other infection models (Profaci et al., 2020). We hypothesized that these spatial, temporal, and developmental transitions would impose bottlenecks upon the parasite population represented in the brain at chronic infection.
We sought to test this by infecting murine hosts intraperitoneally with the pooled library of PruΔku80 parasites stably expressing 96 neutral barcode alleles.
First, we confirmed that diluting the PruΔku80 library to a dose tolerated by mice (and therefore appropriate for achieving a chronic infection) did not disrupt the population structure in a way that could confound the interpretation of in vivo experiments. Three inoculum samples of 37,000 viable tachyzoites (determined by plaque assay) were plated on HFFs and re-expanded in tissue culture followed by DNA isolation for NGS. In each inoculum sample, all 96 barcodes were detected by NGS, and pairwise comparisons of each sample were strongly correlated (Figures 2D-2F), confirming that in vitro expansion of in vivo samples would minimally affect library composition.
An in vivo experiment was conceived to study the effect of murine host colonization. This would probe the within-host population genetics of the T. gondii infection as it progressed from the initial acute infection in the peritoneum to the chronic infection in the brain (Figure 3A). To determine if the parasite population was stable early in acute infection, parasites were isolated from the peritoneal cavity of some mice at 48 h post-infection and expanded in vitro. Although barcode extinction was rare (one barcode in a single mouse), the relative frequency of barcodes in each peritoneal isolate was unique to each animal (Figures 3B and 3C). This observation supports a model in which the initial selective sweep at the onset of acute infection is stochastic and emphasizes the need to consider each host organism as a unique environment. After 28 days, the brains of chronically infected mice were isolated, and parasites expanded in vitro. Unexpectedly, the majority of barcodes were detected in each host brain (Figure 3D; Table S1), consistent with a minimal founder effect on the genetic diversity of the parasite population colonizing the CNS. The cumulative extinction frequency across all 14 mice was 0.007 (10 of 1,344 total barcodes), and lost barcodes were most often represented at low abundance within the inoculum (Table S3). Of note, these extinction events were observed in the NGS runs with the lowest total read counts, suggesting that extinctions could be due to reduced sequence sampling depth rather than a true absence of the specific barcoded parasite(s) in the brain (Table S3). Together, these data indicate that any selection bottleneck experienced by parasites during the colonization of the brain niche must be broad.

Figure 2. (D-F) One-pot transfections using PruΔku80 were diluted to a founder population of 37,000 parasites and re-expanded in HFFs prior to NGS. All 96 barcodes were identified in each sample, and the relative percentage frequency of barcodes was highly correlated in pairwise comparison for each inoculum sample: (D) inoculum 1 versus 2, PCC r = 0.98, n = 96, p < 0.0001 (two-tailed); (E) inoculum 1 versus 3, PCC r = 0.98, n = 96, p < 0.0001 (two-tailed); (F) inoculum 2 versus 3, PCC r = 0.99, n = 96, p < 0.0001 (two-tailed). PCC values provided on scatter plots indicate degree of correlation between populations being compared. See also Figures S1 and S2.

Figure 3. The murine host brain is permissively colonized by T. gondii
(A) Schematic of an experiment to identify changes in the T. gondii population structure over the course of a 28-day in vivo infection. CBA/J mice were infected with an inoculum of 37,000 tachyzoites injected intraperitoneally. The infectious population structure was monitored in peritoneal exudates at 48 h post-infection (early acute) or in the brain at the onset of the chronic phase (28 days post-infection). Parasites were re-expanded in tissue culture prior to NGS.
(B-D) Scatterplots represent individual barcode frequencies in each sample relative to the mean of the inoculum in the inoculum replicates (B; n = 3 replicates), the peritoneal cavity at 48 h (C; n = 3 mice), and the brain at 28 days (D; n = 14 mice). Barcode extinctions (ex.) are indicated below the x axis in the corresponding position to the absent barcoded strain.

Cell Reports Methods 2, 100274, August 22, 2022

Stochastic selection and host genotype shape the T. gondii population structure colonizing the brain
We next sought to determine how host genetic background influences the dynamics of parasite brain colonization over time. To do this we infected Swiss Webster mice, which are outbred to maximize genetic diversity and heterozygosity, and inbred CBA/J mice (Figure 4A) (Watson and Davis, 2019). Mean parasite cyst burden was not significantly different between CBA/J mice and Swiss Webster mice at one month post-infection, indicating that a similar number of parasites were accessing the brain niche (Figure 4B). As expected, cyst burden declined over time, but the reduction in mean cyst number was similar between mouse genotypes (McGovern et al., 2020; Watson and Davis, 2019).
To evaluate the parasite population structure within these hosts, we used Cavalli-Sforza's chord distance calculation (Cavalli-Sforza and Edwards, 1967). Chord distance is a geometric measure of population distance. It provides a measurement of genetic divergence between two populations expressed between zero and one, with "zero" indicating that the genetic structures of the two populations are identical and "one" indicating maximum genetic divergence. Although chord distance does not provide absolute quantification of the width of any bottleneck, it can be used to infer changes in the genetic structure of a population between points A and B. In CBA/J mice (Figure 4C) there was minimal distance separating the inoculum from the peritoneal exudate population 72 h post-infection. A significant increase in distance was observed between the inoculum and the one-month brain samples. Consistent with the decline in cyst number over time, chord distance was greater at three months post-infection in CBA/J brain. Outbred Swiss Webster mice also exhibited little shift in chord distance from the inoculum to the peritoneal exudate and a significant shift in distance from the inoculum to the one-month post-infection brain samples (Figure 4D). However, the individual Swiss mice exhibited greater variation in parasite population structure at each chronic time point than CBA/J mice. Unlike CBA/J mice, Swiss mice supported parasite populations with a similar median chord distance between inoculum and one- or three-month post-infection brains.
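A chord distance of the kind described above can be sketched as follows. Barcode frequencies are treated as alleles at a single locus, each population is mapped to a point on a unit hypersphere via the square roots of its frequencies, and the straight-line (chord) length between the two points is returned. The scaling constant is an assumption: the chord is divided by its maximum value (the square root of 2) so that the result runs from zero to one as stated in the text, whereas other published variants use a 2/π scaling. The exact formula used by the authors is not given in this section.

```python
import math


def chord_distance(p, q):
    """Cavalli-Sforza-style chord distance between two barcode populations.

    p, q -- non-negative barcode counts or frequencies in the same barcode
            order; each is normalized to sum to 1 before comparison.
    Returns a value in [0, 1]: 0 for identical structures, 1 when the two
    populations share no barcodes (assumed normalization, see lead-in).
    """
    if len(p) != len(q):
        raise ValueError("p and q must list the same barcodes")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("each population needs at least one read")
    # Cosine of the angle between the square-root frequency vectors.
    cos_theta = sum(
        math.sqrt((a / total_p) * (b / total_q)) for a, b in zip(p, q)
    )
    cos_theta = min(1.0, cos_theta)  # guard against floating-point overshoot
    # Chord length is sqrt(2 * (1 - cos_theta)); dividing by sqrt(2) rescales it.
    return math.sqrt(1.0 - cos_theta)


# Hypothetical two-barcode populations.
inoculum = [0.5, 0.5]
brain = [0.9, 0.1]
print(round(chord_distance(inoculum, brain), 3))
```

Because each input is normalized internally, raw read counts can be passed directly, which avoids rounding differences between samples sequenced to different depths.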
Cumulatively, our data indicated that following an i.p. infection the murine brain was unexpectedly permissive to colonization ( Figure 3D), and parasite population structure was dynamic within individual hosts ( Figures 4C and 4D). In CBA/J mice, the genetic distance separating the brain niche parasite population from the inoculum was greater at three months versus one month post-infection ( Figure 4C). With this in mind, we analyzed our data to understand if the barcoded lineages that were the most abundant in the inoculum had a competitive advantage for long-term persistence. In CBA/J mice, dominant barcode alleles at three months post-infection ( Figure 4E, top row) corresponded to parasite populations that were among the top 35% of reads (top six most abundant barcodes) in the inoculum. Although this trend was similar at one month post-infection, several CBA/J mice harbored a dominant barcode lineage represented in the bottom 25% of reads in the inoculum ( Figure 4E outset in the inoculum pie chart). In agreement with the chord distance analysis, individual Swiss mice had diverse dominant population structures at three months post-infection ( Figure 4E, bottom row). Although dominant barcodes in the Swiss brains were typically derived from lineages represented within the top 75% of reads in the inoculum ( Figure 4E, most abundant 15 barcodes), this was not always the case. For example, one mouse was dominated by a barcode that was present at low frequency in the inoculum, while another mouse exhibited a population structure in which barcode frequency was more evenly distributed than in the inoculum or peritoneal parasite isolates. This heterogeneity was also reflected in the barcodes that dominated Swiss brains at one month post-infection, suggesting that even relatively low-frequency strains can establish and maintain persistent infection in a manner that likely depends on stochastic variables and host genotype.
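The comparison between the dominant barcode in a brain sample and its rank in the inoculum can be sketched as follows. Barcodes are ranked by inoculum abundance, and each is assigned the cumulative share of inoculum reads held by more abundant barcodes; a barcode whose position starts at or beyond 0.75 then lies within the bottom 25% of inoculum reads. This definition of "bottom 25% of reads", the function names, and the read counts are assumptions made for illustration and may differ from the authors' analysis.

```python
def inoculum_rank_position(inoculum):
    """Map each barcode to the share of inoculum reads held by barcodes
    that are more abundant than it (0.0 for the most abundant barcode)."""
    total = sum(inoculum.values())
    ranked = sorted(inoculum, key=inoculum.get, reverse=True)
    position = {}
    reads_above = 0
    for name in ranked:
        position[name] = reads_above / total
        reads_above += inoculum[name]
    return position


def dominant_barcode(sample):
    """Return the barcode with the most reads in a sample."""
    return max(sample, key=sample.get)


# Hypothetical read counts for four barcodes.
inoculum = {"bc01": 50, "bc02": 30, "bc03": 15, "bc04": 5}
brain = {"bc01": 10, "bc02": 5, "bc03": 2, "bc04": 83}

position = inoculum_rank_position(inoculum)
top = dominant_barcode(brain)
in_bottom_quarter = position[top] >= 0.75
print(top, position[top], in_bottom_quarter)
```

In this toy example a minor constituent of the inoculum dominates the brain sample, which mirrors the pattern described for some individual mice.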

DISCUSSION
Brain residency has long been appreciated as a strategy for pathogenesis exploited by T. gondii to evade sterilizing immunity, as well as an evolutionary strategy to increase the likelihood of transmission via predation. Feline consumption of high-fat, energy-rich infected neuronal tissue of prey, including mice, provides T. gondii with a route back into the definitive feline host (Dubey, 1997; Plantinga et al., 2011). A means to evade any restrictive bottleneck when colonizing the brain niche would be consistent with this evolutionary strategy, supporting maximal transmission of genetic diversity into the feline host to contribute to recombination in the subsequent sexual cycle. Additionally, T. gondii can infect most warm-blooded animals, indicating that intermediate host-specific selection environments will frequently and unpredictably change, favoring different phenotypes at different times. It therefore likely benefits the parasite to maintain genetic diversity and phenotypic plasticity to ensure survival in varied future host species.
A key role of tissue barriers is to restrict pathogen access to the underlying tissue niche (Kim, 2008). Unexpectedly our findings imply multiple unique T. gondii colonization events from circulation into the brain parenchyma, rather than rare brain invasions coupled with expansion within that niche. During intraperitoneal infection, the loss of barcodes is stochastic, and most barcodes are identified in the brain parenchyma at one month post-infection (Figure 3). Chord distance indicated a change in the genetic structure of the parasite population in the CNS at one month post-infection ( Figures 4C and 4D). In CBA/J mice and Swiss mice, the dominant barcodes in the brain following intraperitoneal infection tended to be those most frequently observed in the inoculum, but most barcodes were still detected, indicating limited selective pressure imparted by the BBB. There were exceptions to this, particularly in outbred Swiss mice, in which minor constituents of the inoculum were able to dominate the brain population in some individuals. Although this finding is intriguing, a greater number of infections (requiring more complex statistical methods) will need to be done before definitive conclusions can be made. It is also important to acknowledge that our data represent the sum of two differentiations: tachyzoite-to-bradyzoite in vivo, followed by bradyzoite-to-tachyzoite in vitro.
We used intraperitoneal infection because it facilitates a precise quantification of barcode frequency and coverage across the inoculum dose. The inoculum dose was chosen to achieve at least 100× coverage of all barcodes, thereby accounting for possible differences in viability of tachyzoites in vitro versus in vivo. This is not possible when using bradyzoite cysts, as each cyst, which must be harvested from infected mice, has a variable parasite number. In the future, it will be interesting to investigate how host colonization following oral infection shapes the population structure of the parasite, and how it differs from the intraperitoneal route. It should also be noted that the results of our study do not mean that there is no bottleneck for infection of the brain niche, simply that with the number of barcode markers used in this study, we were not able to determine the bottleneck width. On the basis that all 96 barcodes were identified within the brains of the infected hosts, we estimate the bottleneck to be at least 96 parasites wide.

Figure 4. Significance relative to inoculum was tested using Kruskal-Wallis one-way ANOVA, with the Mann-Whitney test used for pairwise comparisons.
(E) Parts-of-whole charts representing the relative frequency of each barcode within the mouse samples. Barcodes are ranked in descending order of abundance in the inoculum mean (inoculum mean n = 3, represented as a pie chart and bar charts labeled "inoc."). A color was assigned to a barcode if it was the dominant barcode in any brain sample or if it represented greater than 20% of all reads in any brain sample. Each color is unique to one barcode. Any barcode that dominated a mouse brain and was represented in the bottom 25% of inoculum reads is outset in the inoculum pie chart for clarity and shaded brown, gray, or black in parts-of-whole charts.
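The coverage reasoning behind the inoculum dose can be made concrete with a short calculation. The dose (37,000 tachyzoites), the library size (96 barcodes), and the 100× target come from the text; the assumption of an even barcode distribution for the mean, and the function names, are illustrative only, since the real library is unevenly distributed.

```python
def mean_coverage(dose, n_barcodes):
    """Average number of parasites per barcode if barcodes were evenly represented."""
    return dose / n_barcodes


def min_frequency_for_coverage(dose, target_coverage):
    """Smallest barcode frequency that still yields the target parasite count."""
    return target_coverage / dose


DOSE = 37_000          # viable tachyzoites per inoculum (from the text)
N_BARCODES = 96        # library complexity (from the text)
TARGET = 100           # desired parasites per barcode (from the text)

mean_cov = mean_coverage(DOSE, N_BARCODES)
min_freq = min_frequency_for_coverage(DOSE, TARGET)

print(f"mean coverage: {mean_cov:.1f} parasites per barcode")
print(f"minimum barcode frequency for {TARGET}x: {min_freq:.4%}")
```

Under these assumptions, an evenly distributed library would carry roughly 385 parasites per barcode, and any barcode making up at least about 0.27% of the inoculum would meet the 100× target.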
The observations made in this study would not have been possible without a means to barcode this eukaryotic pathogen. Cellular barcoding has provided critical insights into the infection biology of viruses such as poliovirus (Kuss et al., 2008; Pfeiffer and Kirkegaard, 2006) and bacteria such as Salmonella (Grant et al., 2008; Kaiser et al., 2013; Lam and Monack, 2014; Lim et al., 2014) and Escherichia coli. We anticipate that barcoded T. gondii strains will have similar far-reaching applications. Although barcode sequencing (barseq) strategies have been widely used for eukaryotic pathogen phenotypic screens (Alsford et al., 2011; Baker et al., 2021; Beneke et al., 2019; Beneke and Gluenz, 2020; Bushell et al., 2017; Sidik et al., 2016), to date the use of cellular barcodes to study the within-host infectious population structure of wild-type eukaryotic pathogens has been limited. A notable example is the use of eight uniquely barcoded T. brucei strains to study colonization of the tsetse fly and subsequent survival of these strains within the bloodstream of the murine host (Oberle et al., 2010). The increased number of cellular barcodes afforded by our simple approach will provide an even greater opportunity for T. brucei researchers to interrogate colonization of the host organism, such as the colonization of subcutaneous fat and bone marrow (Mugnier et al., 2015).
The versatility of the strategies presented will allow researchers to work with individual barcoded strains in isolation prior to pooling (via plate-based library generation) or with complex libraries of barcoded strains generated through our one-pot approach. We believe that this will expand the application of our approach beyond the population genetic studies described herein. For example, barcoded cancer cell lines have been successfully used to multiplex an in vivo drug screen (Gruner et al., 2016). Importantly, this allowed the authors to chemically interrogate metastatic seeding, a unique feature of the in vivo disease that cannot be accurately represented in vitro. It should be noted that disruption of the UPRT locus has been documented to negatively affect cyst burden, which would suggest that our data might underrepresent barcode diversity in the CNS. However, all barcodes are inserted at the same position, so the cross-comparison of populations is internally controlled. Our simple oligo barcoding strategy can be applied to systems accessible to CRISPR that contain endogenous or engineered negative selection markers, which we successfully demonstrated with T. brucei. In principle, our approach allows parallel integration of even greater numbers of unique barcodes, which will increase precision in future host-pathogen population genetic studies. An increased number of cellular barcodes combined with methods such as sequence tag analysis of microbial populations (STAMP) (Abel et al., 2015) or the recently published update STAMPR will make it possible to quantify the absolute founder population number present within the host brain. Combined with these methods, our cellular barcoding approach will allow researchers to probe the within-host population genetics of tissue colonization during a T. gondii infection with an unprecedented degree of resolution.
Limitations of the study
Our methodology currently requires that bradyzoite cysts spontaneously re-differentiate in vitro following isolation from the host. This second stage conversion is required to expand parasite numbers to sufficient quantities for our NGS workflow, but it should be noted that it might influence the frequency distribution of barcodes in the final population.

STAR+METHODS
Detailed methods are provided in the online version of this paper and include the following:

RESOURCE AVAILABILITY
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Matthew Child (m.child@imperial.ac.uk).

Materials availability
This study did not generate new unique reagents.
Data and code availability
• All next-generation sequencing data have been deposited at NCBI Sequence Read Archive and are publicly available as of the date of publication. Fastq files for the bar-seq analysis and raw counts in this study have been deposited in the Short Read
• Any additional information required to reanalyse the data reported in this paper is available from the lead contact upon request.

EXPERIMENTAL MODEL AND SUBJECT DETAILS
Toxoplasma gondii cell culture
T. gondii parasite strains were maintained by serial passage in confluent human foreskin fibroblasts (HFF-1, ATCC® SCRC-1041™). HFFs were cultured at 37 °C with 5% CO2 in Dulbecco's Modified Eagle's medium supplemented with 10% foetal bovine serum and 2 mM L-glutamine. Tachyzoites were harvested via mechanical syringe lysis of heavily infected HFFs through a 25-gauge needle. RHΔku80 parasites were used for in vivo and in vitro studies. PruΔku80 parasites were used in in vivo experiments where chronic infections were established.

Animal work
Six-week-old female C57BL/6 or CBA/J mice were purchased from Jackson Laboratories. Mice were acclimated for seven days prior to infection. Six- to eight-week-old female Swiss Webster mice (originally from Jackson Laboratories) were obtained from the University of Virginia Centre for Comparative Medicine foster and sentinel colony. For studies using CBA/J and Swiss Webster mice, the animal protocols were approved by the University of Virginia Institutional Animal Care and Use Committee (protocol # 4107-12-18). All animals were housed and treated in accordance with AAALAC and IACUC guidelines at the University of Virginia Veterinary Centre for Comparative Medicine. The procedures involving C57BL/6 mice (multiplex experiment) were approved by the local ethical committee of the Francis Crick Institute Ltd, Mill Hill Laboratory and are part of a project license approved by the Home Office, UK, under the Animals (Scientific Procedures) Act 1986.

Trypanosoma brucei brucei cell culture
We used a bloodstream form Lister 427 strain T. brucei brucei for all experiments. The cells were cultured at 37 °C, 5% CO2 in HMI-9 medium supplemented with 20% heat-inactivated foetal bovine serum, 100 U/ml penicillin and 100 μg/ml streptomycin (Gibco).

METHOD DETAILS
Generation of barcoded Toxoplasma gondii
Generation of barcoded T. gondii strains and libraries: 60-nucleotide single-stranded oligos were designed to include a unique six-nucleotide barcode sequence flanked by a stop codon and homology regions on either side. Barcodes were designed using the DNA barcode designer and decoder, nxcode (http://hannonlab.cshl.edu/nxCode/nxCode/main.html). The sequences of all oligos within the 96-member library can be found in the key resources table. Barcoded libraries of tachyzoites were generated using two alternative strategies. For strategy A, 96 independent transfections were carried out in 16-well Nucleocuvette strips. 10 μg of the pSAG1::Cas9-U6::sgUPRT vector and 10 μg of the barcode oligo (equivalent to an ~1:160 molar ratio of plasmid to oligo) were co-transfected into approximately 1 × 10^6 extracellular tachyzoites using the 4D-Nucleofector X Unit programme F1-115 (Lonza). At 24 hours post-transfection, transgenic barcoded parasites were selected using 5 μM 5'-fluoro-2'-deoxyuridine (FUDR). Barcoded strains were maintained independently and pooled only just prior to use. For strategy B, a single "one-pot" transfection was carried out. An oligo library pool containing roughly equal amounts of all barcode oligos was prepared. The ratio of the pSAG1::Cas9-U6::sgUPRT vector to the total oligo pool was the same as in strategy A, though here the final concentration of any single oligo within the pool was ~100-fold less. Transfection and selection were performed as for strategy A, with the complex barcoded strain library generated and maintained as a single population.
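The repair-template layout described above (a six-nucleotide barcode with a stop codon and a homology region on either side) can be illustrated with a minimal sketch. The function names `build_oligo` and `parse_oligo`, the default stop codons, and the fixed 24-nt arm length are illustrative assumptions; they follow the T. brucei AAT6 template quoted in the next subsection, not the T. gondii oligos, whose arm sequences are listed only in the key resources table. Barcodes in the study were designed with nxcode, which this sketch does not reproduce.

```python
STOP_CODONS = {"TAG", "TGA", "TAA"}


def build_oligo(left_arm, barcode, right_arm, stop5="TAG", stop3="TGA"):
    """Assemble arm + stop + barcode + stop + arm.

    Stop codons are written in lower case and everything else in upper
    case, mirroring the notation used for the T. brucei template.
    """
    for name, seq in (("left_arm", left_arm), ("barcode", barcode),
                      ("right_arm", right_arm)):
        if not seq or set(seq.upper()) - set("ACGT"):
            raise ValueError(f"{name} must be a non-empty A/C/G/T string")
    for stop in (stop5, stop3):
        if stop.upper() not in STOP_CODONS:
            raise ValueError(f"{stop} is not a stop codon")
    return (left_arm.upper() + stop5.lower() + barcode.upper()
            + stop3.lower() + right_arm.upper())


def parse_oligo(oligo, arm_len=24, barcode_len=6):
    """Split an oligo built as above back into its five parts."""
    expected = 2 * arm_len + 6 + barcode_len
    if len(oligo) != expected:
        raise ValueError(f"expected {expected} nt, got {len(oligo)}")
    i = arm_len
    parts = {
        "left_arm": oligo[:i],
        "stop5": oligo[i:i + 3],
        "barcode": oligo[i + 3:i + 3 + barcode_len],
        "stop3": oligo[i + 3 + barcode_len:i + 6 + barcode_len],
        "right_arm": oligo[i + 6 + barcode_len:],
    }
    for key in ("stop5", "stop3"):
        if parts[key].upper() not in STOP_CODONS:
            raise ValueError(f"{key} is not a stop codon: {parts[key]}")
    return parts
```

Called with the two 24-nt arms and the barcode AAACAC from the T. brucei template, `build_oligo` returns a 60-nt string, and `parse_oligo` recovers the barcode from it.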
Generation of barcoded Trypanosoma brucei brucei
60-nucleotide double-stranded oligos were designed to include a unique six-nucleotide barcode sequence (bold, upper case) flanked by a stop codon (lower case) and 24 bp homology regions on either side: TGCAATCAGAAGACGAGGTTTAAGtagAAACACtgaCTCACACTAACCGTTTCGATTTAC. Amino acid transporter AAT6 (Tb927.8.5450) was selected as a suitable locus for barcode integration. DNA encoding the sgRNA sequence targeting the AAT6 locus (GTTTAAGTTCACATTGTCGC) was generated by PCR as described (Rico et al., 2018), ethanol precipitated, and 10 μg was mixed with 10 ng pre-annealed oligonucleotides. The mixture (20 μl total volume) was added to ~10^7 Cas9-expressing bloodstream form Lister 427 T. b. brucei cells in 100 μl Amaxa buffer and electroporated using the Amaxa Nucleofector IIb (Lonza), program X-001. Transfected cells were immediately added to pre-warmed HMI-9 medium containing 270 μM eflornithine. Transfected cell cultures were passaged under selection every two days. Drug-resistant parasites were harvested seven days after transfection for genomic DNA isolation. PCR amplicons encompassing the barcoded region of the AAT6 locus (from four independent transfections) were generated using ORF-specific PCR primer sequences (5′ to 3′) ATGAGAGAGCCGATACAAACTTCAAC and TCAGAGTTCAGCAATGACGCTG. Barcode integration was confirmed by Sanger sequencing of these amplicons. For the one-pot transfection strategy, complementary single-stranded barcoding oligos were annealed to produce 96 unique double-stranded barcoding repair templates. Annealed barcoding oligos were then pooled and used in a single transfection as described above.

Cell Reports Methods 2, 100274, August 22, 2022

NGS library preparation
Frozen cell pellets of parasites were thawed to room temperature and genomic DNA extracted using the DNeasy Blood & Tissue Kit (Qiagen).
Genomic DNA libraries were prepared following the 16S Metagenomic Sequencing Library Preparation guide (Illumina). In brief, an ~300 base pair amplicon region encompassing the 6 nt barcode sequence was amplified (30 cycles) from the barcoded UPRT locus in Toxoplasma gondii using primer sequences (5′ to 3′) TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGtggatgtgtcataccatggagtttcctg and GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGtgttttagtgtaacaaagtggacagcagc.
These primer sequences include the specified Illumina adapter overhang sequences (bold, uppercase). AMPure XP beads were used to purify the resulting PCR product. An indexing PCR (10 cycles) was carried out using the purified product as the template to add dual indices and sequencing adapters to the amplicon using the Nextera XT Index Kit (Illumina). Indexed libraries were then cleaned using AMPure XP beads and quantified on the Quantus Fluorometer using the QuantiFluor ONE dsDNA System (Promega). Amplicons were purity-checked and sized on a TapeStation using the D1000 ScreenTape System (Agilent). For each NGS run, typically 8 to 25 uniquely indexed libraries were pooled at equimolar concentrations for multiplexed outputs on either an Illumina MiSeq or NextSeq sequencer using the MiSeq V3 PE 75 bp kit or NextSeq 500/550 Mid Output v2.5 PE 75 bp kit, respectively. A PhiX DNA spike-in of 20% was used in all NGS runs. Following acquisition, sequencing data were demultiplexed and total sample reads extracted from fastq files using the Galaxy web platform (www.usegalaxy.org). Within Galaxy, sequencing reads were concatenated, trimmed, and split into the respective barcodes. Phred QC scores for all NGS runs were >30, with the exception of a single run used for analysis of technical and biological replicates, which still gave an acceptable score of 28. Following trimming to the appropriate 6 nt region, a stringent barcode mismatch tolerance of 0% was applied, typically resulting in 10-15% of total reads being discarded. Barcode read data were analysed using Prism 8, and correlation coefficients were calculated within the software using Pearson analysis. To test the sensitivity of the NGS pipeline (Figure 1E), a 96-well plate of 96 uniquely barcoded strains was set up (using transfected drug-resistant parasites), with 10,000 parasites/well.
For the plate (12 columns × 8 rows), a serial two-fold dilution was performed across the 12 columns, and all rows in the final column were pooled after the final dilution. Genomic DNA was then prepared from the final pooled sample and processed for NGS as described.
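The barcode assignment step described above (trimming each read to the 6 nt barcode region and accepting only exact matches) can be sketched in a few lines. This is an illustration under stated assumptions, not the Galaxy workflow used in the study: the function names, the use of constant flanking sequences to locate the barcode, and the extra requirement that the right-hand flank also match are all hypothetical choices.

```python
def count_barcodes(reads, known_barcodes, left_flank, right_flank,
                   barcode_len=6):
    """Count reads per barcode with zero mismatch tolerance.

    A read is counted only if the barcode sits between the two constant
    flanks and matches a known barcode exactly. All other reads are
    discarded. Returns (counts, number_discarded).
    """
    counts = {bc.upper(): 0 for bc in known_barcodes}
    left = left_flank.upper()
    right = right_flank.upper()
    discarded = 0
    for read in reads:
        read = read.upper()
        start = read.find(left)
        if start == -1:
            discarded += 1
            continue
        bc_start = start + len(left)
        bc_end = bc_start + barcode_len
        barcode = read[bc_start:bc_end]
        if barcode in counts and read.startswith(right, bc_end):
            counts[barcode] += 1
        else:
            discarded += 1
    return counts, discarded


def percent_representation(counts):
    """Convert raw read counts into percent of assigned reads."""
    total = sum(counts.values())
    if total == 0:
        return {bc: 0.0 for bc in counts}
    return {bc: 100.0 * n / total for bc, n in counts.items()}
```

The percent representation per barcode is the quantity that would then be compared between samples, for example in a Pearson correlation between replicates.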

In vivo experiments
For intraperitoneal infection, the pooled barcode parasite library was expanded on HFFs in a T175 flask. Once full parasite vacuoles were observed, parasites were scraped and syringe lysed, counted on a haemocytometer and diluted to an inoculum of 37,000 viable parasites (data represented in Figure 3) or 12,000 viable parasites (data represented in Figure 4) in 200 μL of PBS per mouse. The numbers of viable parasites in the IP infection inoculums were determined by plaque assay. At the time of inoculation, 2 × 10^6 parasites were frozen as an initial population control. In addition, three inoculum control samples were expanded immediately on HFFs in T25 flasks. After 48 or 72 hours, three to five mice were euthanized to isolate parasites in the peritoneal exudate. Specifically, 10 mL of PBS was injected by 25G needle into the peritoneal cavity, mice were rocked vigorously, and peritoneal fluid was removed by syringe. Parasites and exudate cells were washed twice in 10 mL of media containing penicillin/streptomycin, pelleted at 1,500 rpm and plated on HFFs in T25 flasks. Parasites were harvested when they approached full lysis of the monolayer, pelleted, and frozen for genomic DNA isolation. After 28 days or three months, the remaining mice were euthanized. Carcasses were incubated in 20% bleach for 10 minutes and the brain was excised in the biosafety cabinet under sterile conditions. To isolate parasites, the brains were mashed through a 70 μm filter using 25 mL PBS with 5% FBS and penicillin/streptomycin. Brain mash was pelleted for 10 minutes at 1,500 rpm, washed twice with PBS and penicillin/streptomycin, then plated on HFF monolayers in T75 flasks. After 36 hours, media was changed to remove debris. Parasites were harvested by syringe lysis when the HFF monolayer was nearly lysed out (approximately two weeks), pelleted and frozen for genomic DNA isolation.
To confirm cyst formation in the brain at one month (28 days) or three months post-infection, 1/50th of the mash was reserved, fixed in 4% paraformaldehyde for 15 minutes, then stained with a 1:500 dilution of Dolichos biflorus agglutinin conjugated to FITC in PBS (Vector Labs). FITC-positive cysts were confirmed by fluorescence and morphology under 20× magnification, and the total cyst burden per brain was back-calculated.
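The back-calculation mentioned above is a simple scaling of the cysts counted in the reserved aliquot by the fraction of the brain mash it represents. A minimal sketch, assuming the whole 1/50th aliquot is examined and using a hypothetical function name:

```python
def total_cyst_burden(cysts_counted, aliquot_denominator=50):
    """Scale the cyst count in a 1/N aliquot up to the whole brain.

    aliquot_denominator=50 corresponds to the 1/50th of brain mash
    reserved for staining.
    """
    if cysts_counted < 0:
        raise ValueError("cysts_counted must be non-negative")
    if aliquot_denominator < 1:
        raise ValueError("aliquot_denominator must be at least 1")
    return cysts_counted * aliquot_denominator
```

With this scaling, a single cyst in the aliquot corresponds to 50 cysts per brain, so a burden below roughly 50 cysts may go undetected by count alone.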

QUANTIFICATION AND STATISTICAL ANALYSIS
Bottleneck analysis and chord distance calculations
Genetic selection bottlenecks experienced within the murine host were estimated by calculating changes in the relative frequencies of barcodes within dynamic T. gondii populations in relation to the starting population in the inoculum. The following equations 26 were used to calculate chord distance:

Table S1: Details of barcode extinctions occurring during the different phases of infection. Barcode numbers and the mouse in which the extinction was observed are noted as shown in Figure 2C-D. Extinctions are defined by an absence of the barcode sequence within the processed NGS read data. Related to Figure 3.

Figures 1 and 2. A) A single ~300 bp region of the UPRT locus was amplified from genomic DNA and purified (amplicon, lanes 1 and 2). The purified amplicon was then indexed and re-purified prior to quantification and sizing (indexed amplicon, lanes 3 and 4). B) Strategy to isolate the primary source of variation to the multiplexed transfection barcoding strategy or the NGS pipeline. C-E) Scatter plots show the percent representation of individual barcodes within library pools for biological replicate transfections (C, PCC r = 0.0425, n = 96, P (two-tailed) = 0.6807), or technical replicates of amplicon indexing (D, PCC r = 0.9744, n = 96, P (two-tailed) < 0.0001) and (E, PCC r = 0.9952, n = 96, P (two-tailed) < 0.0001). PCC values represent comparison between samples indicated on each x and y axis.
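For the chord distance calculation referred to under "Bottleneck analysis and chord distance calculations", the equations themselves are not reproduced in the text. As an assumption, the sketch below uses the Cavalli-Sforza and Edwards chord distance, the form commonly used in barcode-based bottleneck analyses such as STAMP: cos θ is the sum over barcodes of the square root of the product of the two frequencies, and the distance is (2√2/π)·√(1 − cos θ). Whether the authors used exactly this form is not confirmed by the source, and the function names are hypothetical.

```python
import math


def barcode_frequencies(counts):
    """Convert raw barcode read counts into relative frequencies."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("no reads to normalise")
    return {bc: n / total for bc, n in counts.items()}


def chord_distance(freq_a, freq_b):
    """Cavalli-Sforza and Edwards chord distance between two populations.

    freq_a and freq_b map barcode -> relative frequency (each summing
    to 1). A barcode missing from one population is treated as having
    frequency 0 there. Returns 0 for identical populations and
    2*sqrt(2)/pi for populations that share no barcodes.
    """
    barcodes = set(freq_a) | set(freq_b)
    cos_theta = sum(
        math.sqrt(freq_a.get(bc, 0.0) * freq_b.get(bc, 0.0))
        for bc in barcodes
    )
    # Guard against floating-point rounding pushing cos_theta above 1.
    cos_theta = min(1.0, cos_theta)
    return (2.0 * math.sqrt(2.0) / math.pi) * math.sqrt(1.0 - cos_theta)
```

In this setting, `freq_a` would be the barcode frequencies in the inoculum and `freq_b` those recovered from a tissue sample. Larger distances indicate a greater shift in population structure relative to the starting population.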