Global diversity and antimicrobial resistance of typhoid fever pathogens: Insights from a meta-analysis of 13,000 Salmonella Typhi genomes

Background: The Global Typhoid Genomics Consortium was established to bring together the typhoid research community to aggregate and analyse Salmonella enterica serovar Typhi (Typhi) genomic data to inform public health action. This analysis, which marks 22 years since the publication of the first Typhi genome, represents the largest Typhi genome sequence collection to date (n=13,000). Methods: This is a meta-analysis of global genotype and antimicrobial resistance (AMR) determinants extracted from previously sequenced genome data and analysed using consistent methods implemented in open analysis platforms GenoTyphi and Pathogenwatch. Results: Compared with previous global snapshots, the data highlight that genotype 4.3.1 (H58) has not spread beyond Asia and Eastern/Southern Africa; in other regions, distinct genotypes dominate and have independently evolved AMR. Data gaps remain in many parts of the world, and we show the potential of travel-associated sequences to provide informal ‘sentinel’ surveillance for such locations. The data indicate that ciprofloxacin non-susceptibility (>1 resistance determinant) is widespread across geographies and genotypes, with high-level ciprofloxacin resistance (≥3 determinants) reaching 20% prevalence in South Asia. Extensively drug-resistant (XDR) typhoid has become dominant in Pakistan (70% in 2020) but has not yet become established elsewhere. Ceftriaxone resistance has emerged in eight non-XDR genotypes, including a ciprofloxacin-resistant lineage (4.3.1.2.1) in India. Azithromycin resistance mutations were detected at low prevalence in South Asia, including in two common ciprofloxacin-resistant genotypes. Conclusions: The consortium’s aim is to encourage continued data sharing and collaboration to monitor the emergence and global spread of AMR Typhi, and to inform decision-making around the introduction of typhoid conjugate vaccines (TCVs) and other prevention and control strategies. 
Funding: No specific funding was awarded for this meta-analysis. Coordinators were supported by fellowships from the European Union (ZAD received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 845681), the Wellcome Trust (SB, Wellcome Trust Senior Fellowship), and the National Health and Medical Research Council (DJI is supported by an NHMRC Investigator Grant [GNT1195210]).


Introduction
Salmonella enterica serovar Typhi (Typhi) causes typhoid fever, a predominantly acute bloodstream infection associated with fever, headache, malaise, and other constitutional symptoms. If not treated appropriately, typhoid fever can be fatal; mortality ratios are now estimated at <1%, but ranged from 10% to 20% in the preantibiotic era (Andrews et al., 2018; Stuart and Pullen, 1946). Historically, the disease was responsible for large-scale epidemics, triggered by the unsanitary conditions created during rapid urbanisation. Typhoid fever has since been largely controlled in many parts of the world due to large-scale improvements in water, sanitation, and hygiene (WASH) (Cutler and Miller, 2005), but was still responsible for an estimated 10.9 million illnesses and 116,800 deaths worldwide in 2017, largely in parts of the world where WASH is suboptimal (GBD 2017 Typhoid and Paratyphoid Collaborators, 2019). Antimicrobial therapy has been the mainstay of typhoid control, but multidrug resistance (MDR, defined as combined resistance to ampicillin, chloramphenicol, and co-trimoxazole) emerged in the 1970s, and resistance to newer drugs including fluoroquinolones, third-generation cephalosporins, and azithromycin has been accumulating over the last few decades (Marchello et al., 2020).
In 2001, the first completed whole genome sequence of Typhi was published (Parkhill et al., 2001). The sequenced isolate was CT18, an MDR isolate cultured from a typhoid fever patient in the Mekong Delta region of Vietnam in 1993. The genome was the result of 2 years of work piecing together plasmid-cloned paired-end sequence reads generated by Sanger capillary sequencing. Together with other early bacterial pathogen genomes, including a second Typhi genome (Ty2) published 2 years later in 2003 (Deng et al., 2003), the CT18 genome was heralded as a major turning point in the potential for disease control, treatment, and diagnostics, providing new tools for epidemiology, molecular microbiology, and bioinformatics. It formed the basis for new insights into comparative and functional genomics (Boyd et al., 2003; Faucher et al., 2006), and facilitated early genotyping efforts (Roumagnac et al., 2006). When high-throughput sequencing technologies such as 454 and Solexa (subsequently Illumina) emerged, Typhi was an obvious first target for in-depth characterisation of a single pathogen population, and genomics has been increasingly exploited to describe the true population structure and global expansion of this highly clonal pathogen. Now, whole genome sequencing (WGS) is becoming a more routine component of typhoid surveillance. Salmonella were among the first pathogens to transition to routine sequencing by public health laboratories in high-income countries (Stevens et al., 2022), and these systems often capture Typhi isolated from travel-associated typhoid infections, providing an informal mechanism for sentinel genomic surveillance of pathogen populations in typhoid endemic countries (Ingle et al., 2019).
More recently, WGS has been adopted for typhoid surveillance by national reference laboratories in endemic countries including the Philippines, Nigeria (Okeke et al., 2022), and South Africa (Lagrada et al., 2022), and PulseNet International is gradually transitioning to WGS (Davedow et al., 2022; Nadon et al., 2017). Following the first global genomic snapshot study, which included nearly 2000 genomes of Typhi isolated from numerous typhoid prevalence and incidence studies conducted across Asia and Africa, WGS has become the standard tool for characterising clinical isolates. Given the very high concordance between antimicrobial susceptibility to clinically relevant drugs and known genetic determinants of antimicrobial resistance (AMR) in Typhi (Argimón et al., 2021a; Chattaway et al., 2021; da Silva et al., 2022), WGS is also increasingly used to infer resistance patterns.
The adoption of WGS for surveillance relies on the definition of a genetic framework with linked standardised nomenclature, often supplied by multilocus sequence typing (MLST) and core genome multilocus sequence typing (cgMLST) for clonal pathogens. Typhi evolves on the order of 0.5 substitutions per year, much more slowly than host-generalist Salmonella, such as S. enterica serovars Kentucky and Agona (five substitutions per year) (Achtman et al., 2021; Duchêne et al., 2016). As a result, the cgMLST approach, which utilises 3002 core genes (Zhou et al., 2020) (two-thirds of the genome) and is popular with public health laboratories for analysis of non-typhoidal S. enterica, has limited utility for Typhi. Instead, most analyses rely on identifying single nucleotide variants (SNVs) and using these to generate phylogenies. This approach allows for fine-scale analysis of transmission dynamics (although not resolving individual transmission events, due to the slow mutation rate; Campbell et al., 2018) and tracking the emergence and dissemination of AMR lineages (Klemm et al., 2018; da Silva et al., 2022; Wong et al., 2015). In the absence of a nomenclature system such as that provided by cgMLST, an alternative strategy was needed for identifying and naming lineages.

eLife digest
Salmonella Typhi (Typhi) is a type of bacteria that causes typhoid fever. More than 110,000 people die from this disease each year, predominantly in areas of sub-Saharan Africa and South Asia with limited access to safe water and sanitation. Clinicians use antibiotics to treat typhoid fever, but scientists worry that the spread of antimicrobial-resistant Typhi could render the drugs ineffective, leading to increased typhoid fever mortality.
The World Health Organization has prequalified two vaccines that are highly effective in preventing typhoid fever and may also help limit the emergence and spread of resistant Typhi. In low resource settings, public health officials must make difficult trade-off decisions about which new vaccines to introduce into already crowded immunization schedules. Understanding the local burden of antimicrobial-resistant Typhi and how it is spreading could help inform their actions.
The Global Typhoid Genomics Consortium analyzed 13,000 Typhi genomes from 110 countries to provide a global overview of genetic diversity and antimicrobial resistance patterns. The analysis showed great genetic diversity among strains from different countries and regions. For example, the H58 Typhi variant, which is often drug-resistant, has spread rapidly through Asia and Eastern and Southern Africa, but is less common in other regions. However, distinct strains of other drug-resistant Typhi have emerged in other parts of the world.
Resistance to the antibiotic ciprofloxacin was widespread and accounted for over 85% of cases in South Africa. Around 70% of Typhi from Pakistan were extensively drug-resistant in 2020, but these hard-to-treat variants have not yet become established elsewhere. Variants that are resistant to both ciprofloxacin and ceftriaxone have been identified, and azithromycin resistance has also appeared in several different variants across South Asia.
The Consortium's analyses provide valuable insights into the global distribution and transmission patterns of drug-resistant Typhi. Limited genetic data were available from several regions, but data from travel-associated cases helped fill some regional gaps. These findings may help serve as a starting point for collective sharing and analyses of genetic data to inform local public health action. Funders need to provide ongoing support to help fill global surveillance data gaps.
To address this challenge, a genotyping framework ('GenoTyphi') was developed that uses marker SNVs to assign Typhi genomes to phylogenetic clades and subclades, similar to the strategy that has been widely adopted for Mycobacterium tuberculosis (Coll et al., 2014). The GenoTyphi scheme was initially developed based on an analysis of almost 2000 Typhi isolates from 63 countries. This dataset was used to define a global population framework based on 68 marker SNVs, which were used to define 4 primary clades, 15 clades, and 49 subclades organised into a pseudo-hierarchical framework. This analysis demonstrated that most of the global Typhi population was highly structured and included many subclades that were geographically restricted, with the exception of Haplotype 58, or H58 (so named by Roumagnac et al., 2006, and designated as genotype 4.3.1 under the GenoTyphi scheme). H58 (genotype 4.3.1) was strongly associated with AMR and was found throughout Asia as well as Eastern and Southern Africa. The GenoTyphi framework has evolved and expanded to reflect changes in global population structure and the emergence of additional AMR-associated lineages, and has been widely adopted by the research and public health communities for the reporting of Typhi WGS data (Ingle et al., 2021; da Silva et al., 2022). The genotyping framework, together with functionality for identifying AMR determinants and plasmid replicons, and generating clustering-based trees, is available within the online genomic epidemiology platform Typhi Pathogenwatch (Argimón et al., 2021b). This system is designed to facilitate genomic surveillance and outbreak analysis for Typhi, including contextualisation with global public data, by public health and research laboratories (Argimón et al., 2021a; Ikhimiukor et al., 2022a; Lagrada et al., 2022) without requiring major investment in computational infrastructure or specialist bioinformatics training.
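The idea of marker-based genotyping described above can be illustrated with a minimal Python sketch. It is not the GenoTyphi implementation: the marker positions, alleles, and the `assign_genotype` helper are hypothetical and chosen only to show how nested labels such as 4, 4.3, and 4.3.1 could be resolved to the deepest supported level.

```python
from typing import Dict, List, Tuple

# Hypothetical marker table: genome position -> (derived allele, genotype label).
# Positions and alleles are invented for illustration; they are NOT the real
# GenoTyphi marker SNVs.
MARKERS: Dict[int, Tuple[str, str]] = {
    1000: ("T", "4"),
    2000: ("A", "4.3"),
    3000: ("G", "4.3.1"),
    4000: ("C", "2"),
}


def assign_genotype(calls: Dict[int, str]) -> str:
    """Return the deepest genotype label whose marker allele is observed.

    `calls` maps genome positions to the base called in the query genome.
    """
    supported: List[str] = [
        label
        for position, (allele, label) in MARKERS.items()
        if calls.get(position) == allele
    ]
    if not supported:
        return "unassigned"
    # Deeper levels of the hierarchy have more dot-separated components.
    return max(supported, key=lambda label: label.count("."))


h58_like = {1000: "T", 2000: "A", 3000: "G", 4000: "A"}
print(assign_genotype(h58_like))
```

A production tool would also need to handle missing calls, conflicting markers, and quality thresholds, none of which are modelled here.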
The increasing prevalence of AMR poses a major threat to effective typhoid fever control. The introduction of new antimicrobials to treat typhoid fever has been closely followed by the development of resistance, beginning with widespread chloramphenicol resistance in the early 1970s (Anderson, 1975; Andrews et al., 2018). By the late 1980s, MDR typhoid had become common. The genetic basis for MDR was a conjugative (i.e. self-transmissible) plasmid of incompatibility type IncHI1 (Anderson, 1975), which was first sequenced as part of the Typhi str. CT18 genome in 2001 (Parkhill et al., 2001). This plasmid accumulated genes (blaTEM-1, cat, dfr, and sul) encoding resistance to all three first-line drugs, mobilised by nested transposons (Tn6029 in Tn21, in Tn9) (Holt et al., 2011b; Wong et al., 2015). The earliest known H58 isolates were MDR, and it has been proposed that selection for MDR drove the emergence and dissemination of H58 (Holt et al., 2011b), which is estimated to have originated in South Asia in the mid-1980s (da Silva et al., 2022; Wong et al., 2015) before spreading throughout South East Asia (Holt et al., 2011a; Pham Thanh et al., 2016b) and into Eastern and Southern Africa (Kariuki et al., 2010; Wong et al., 2015). The MDR transposon has subsequently migrated to the Typhi chromosome on several independent occasions (Ashton et al., 2015; Wong et al., 2015), allowing for loss of the plasmid and fixation of the MDR phenotype in various lineages. Other MDR plasmids do occur in Typhi but are comparatively rare (Argimón et al., 2021b; Ingle et al., 2019; Rahman et al., 2020; Tanmoy et al., 2018; Wong et al., 2015).
The emergence of MDR Typhi led to widespread use of fluoroquinolones (mainly ciprofloxacin) as first-line therapy in typhoid fever treatment. Ciprofloxacin non-susceptibility (CipNS, defined by minimum inhibitory concentration [MIC] ≥0.06 mg/L) soon emerged and became common, particularly in South and South East Asia (Chau et al., 2007; Dyson et al., 2019). The genetic basis for this is mainly substitutions in the quinolone resistance determining region (QRDR) of core chromosomal genes gyrA and parC, which directly impact fluoroquinolone binding. These substitutions have arisen in diverse Typhi strain backgrounds (estimated >80 independent emergences) but appear to be particularly common in H58 (4.3.1) subtypes (Roumagnac et al., 2006; da Silva et al., 2022; Wong et al., 2015). The most common genetic pattern is a single QRDR mutation (typically at gyrA codon 83 or 87), which results in a moderate increase in ciprofloxacin MIC to 0.06-0.25 mg/L and is associated with prolonged fever clearance times and increased chance of clinical failure when treating with fluoroquinolones (Pham Thanh et al., 2016a; Wain et al., 1997). An accumulation of three QRDR mutations raises ciprofloxacin MIC to 8-32 mg/L and is associated with higher occurrence of clinical failure (Pham Thanh et al., 2016a). Triple mutants appear to be rare, with the exception of a subclade of 4.3.1.2 bearing GyrA-S83F, GyrA-D87N, and ParC-S80I (designated genotype 4.3.1.2.1; Ingle et al., 2022), which emerged in India in the mid-1990s and has since been introduced into Pakistan, Nepal, Bangladesh, and Chile (Britto et al., 2020; Maes et al., 2020; da Silva et al., 2022; Pham Thanh et al., 2016a).
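The relationship between QRDR mutation count and expected ciprofloxacin phenotype described above can be sketched as a small lookup. This is an illustrative simplification rather than the consortium's classification code: the function name is hypothetical, only QRDR mutations are counted, and treating two mutations as non-susceptible (rather than resistant) is an assumption, since the text gives MIC ranges only for one and three mutations.

```python
def ciprofloxacin_category(n_qrdr_mutations: int) -> str:
    """Map a count of QRDR mutations to an expected ciprofloxacin category.

    Assumptions (not stated in the text): two mutations are grouped with one
    as CipNS, and no other resistance determinants are considered.
    """
    if n_qrdr_mutations < 0:
        raise ValueError("mutation count cannot be negative")
    if n_qrdr_mutations >= 3:
        # Text: three QRDR mutations raise the MIC to 8-32 mg/L.
        return "CipR"
    if n_qrdr_mutations >= 1:
        # Text: a single QRDR mutation gives an MIC of 0.06-0.25 mg/L.
        return "CipNS"
    return "no QRDR determinant detected"


# Genotype 4.3.1.2.1 carries GyrA-S83F, GyrA-D87N, and ParC-S80I.
triple_mutant = ["GyrA-S83F", "GyrA-D87N", "ParC-S80I"]
print(ciprofloxacin_category(len(triple_mutant)))
```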
The accumulation of resistance to almost all therapeutic options means that there is an urgent need to track the emergence and spread of AMR Typhi, both to guide empiric therapy to prevent treatment failure (Nabarro et al., 2022), and to direct the deployment of preventative interventions like typhoid conjugate vaccines (TCVs) and WASH infrastructure. Given the wealth of existing and emerging WGS data for Typhi, we aimed to create a system to enhance visibility and accessibility of genomic data to inform current and future disease control strategies, including identifying where empiric therapy may need review, and monitoring the impact of TCVs on AMR and vaccine escape. In forming the Global Typhoid Genomics Consortium (GTGC), we aim to engage with the wider typhoid research community to aggregate Typhi genomic data and standardised metadata to facilitate the extraction of relevant insights to inform public health policy through inclusive, reproducible analysis using freely available and accessible pipelines and intuitive data visualisation. Here, we present a large, geographically representative dataset of 13,000 Typhi genomes, and provide a contemporary snapshot of the global genetic diversity in Typhi and its spectrum of AMR determinants. The establishment of the GTGC, which marked 21 years of typhoid genomics, provides a platform for future typhoid genomics activities, which we hope will inform more sophisticated disease control.

Ethical approvals
Each contributing study or surveillance programme obtained local ethical and governance approvals, as reported in the primary publication for each dataset. For this study, inclusion of data that were not yet in the public domain by August 2021 was approved by the Observational/Interventions Research Ethics Committee of the London School of Hygiene and Tropical Medicine (ref #26408), on the basis of details provided on the local ethical approvals for sample and data collection (Supplementary file 1).

Sequence data aggregation
Attempts were made to include all Typhi sequence data generated in the 20 years since the first genome was sequenced, through August 2021. Genome data and the corresponding data owners were identified from literature searches and sequence database searches (European Nucleotide Archive [ENA]; NCBI Sequence Read Archive [SRA], and GenBank; Enterobase). Unpublished data, including those from ongoing surveillance studies and routine public health laboratory sequencing, were identified through professional networks, published study protocols, and an open call for participation in the GTGC. All data generators thus identified were invited to join the GTGC and to provide or verify corresponding source information, with year and location of isolation being required fields ('metadata', see below). Nearly all those contacted responded and are included as consortium authors on this study. The exceptions, where authors did not respond to email inquiries, were: (i) one genome reported from Malaysia (Ahmad et al., 2017) and n=133 draft genomes reported from India (Katiyar et al., 2020), which were excluded as sequence reads were not available in NCBI; and (ii) n=39 genomes reported in studies of travel-associated or local outbreaks (Burnsed et al., 2018; Hao et al., 2020; Shin et al., 2021), which were included as raw sequence data and sufficient metadata were publicly available. A further n=850 genomes sequenced by US Centers for Disease Control and Prevention and available in NCBI were excluded from analysis because travel history was unknown and most US cases are travel-associated. Table 1 summarises all studies and unpublished public health laboratory datasets from which sequence data were sourced.
Whole genome sequence data, in the form of Illumina fastq files, were sourced from the ENA or SRA or were provided directly by the data contributors in the case of data that was unpublished in August 2021. Run, BioSample, and BioProject accessions are provided in Supplementary file 2, together with contributed metadata and PubMed or preprint identifiers.

Metadata curation and variable definitions
Owners of the contributing studies were asked to provide or update source information relating to their genome data, using a standardised template (http://bit.ly/typhiMeta). Repeat isolates were defined as those that represent the same occurrence of typhoid infection (acute disease or asymptomatic carriage) as one that is already included in the dataset. In such instances, data owners were asked to indicate the 'primary' isolate (either the first, or the best quality, genome for each unique occurrence of infection).
Data provided on the source of isolates (specimen type and patient health status) are shown in Supplementary file 3. This information was used to identify isolates that were associated with acute typhoid fever. In total, n=6462 genomes were recorded as isolated from symptomatic individuals. A further n=119 were recorded as isolated from asymptomatic carriers. The remaining genomes had no health status recorded (i.e. symptomatic vs asymptomatic carrier); of these, the majority were isolated from blood (n=3365) or the specimen type was not recorded (n=2522). Since most studies and surveillance programmes are set up to capture acute infections rather than asymptomatic carriers, we defined 'assumed acute illness' genomes as those not recorded explicitly as asymptomatic carriers (n=119) or coming from gallbladder (n=1) or environmental (n=14) samples; this resulted in a total of 12,831 genomes that were assumed to represent acute illness.
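As a check on the arithmetic, the 'assumed acute illness' definition can be expressed as a simple exclusion rule. The sketch below is illustrative only: the field values and the `is_assumed_acute` helper are hypothetical, and the starting total of 12,965 is the number of high-quality genomes reported in the overview of available data.

```python
from typing import Optional

# Specimen types treated as not representing acute illness (from the text).
NON_ACUTE_SPECIMENS = {"gallbladder", "environmental"}


def is_assumed_acute(health_status: Optional[str], specimen_type: Optional[str]) -> bool:
    """Apply the exclusion rule; missing values default to 'assumed acute'."""
    if health_status == "asymptomatic carrier":
        return False
    return specimen_type not in NON_ACUTE_SPECIMENS


# Counts reported in the text.
high_quality_genomes = 12_965
excluded = {"asymptomatic carrier": 119, "gallbladder": 1, "environmental": 14}
assumed_acute = high_quality_genomes - sum(excluded.values())
print(assumed_acute)  # 12831
```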
We defined 'country of origin' as the country of isolation; or for travel-associated infections, the country recorded as the presumed country of infection based on travel history (Centers for Disease Control and Prevention, 2011; Ingle et al., 2021; Ingle et al., 2019; Matono et al., 2017). Countries were assigned to geographical regions using the United Nations Statistics Division standard M49 (see https://unstats.un.org/unsd/methodology/m49/overview/); we used the intermediate region label where assigned, and subregion otherwise. To identify isolate collections that were suitably representative of local pathogen populations, for the purpose of calculating genotype and AMR prevalences for a given setting, data owners were asked to indicate the purpose of sampling for each study or dataset. Options available were either 'Non Targeted' (surveillance study, routine diagnostics, reference lab, other; n=11,086), 'Targeted' (cluster investigation, AMR focused, other; n=1862), or 'Not Provided' (n=17). Only samples from 'Non Targeted' sampling frames with known year of isolation and country of origin were included in national prevalence estimates.
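The region rule above (intermediate region where assigned, subregion otherwise) could be sketched as follows. The lookup table is a small hand-typed excerpt of M49 for illustration, and the `analysis_region` helper is a hypothetical name; a real analysis would load the full M49 table from the UN Statistics Division.

```python
from typing import Dict, NamedTuple, Optional


class M49Entry(NamedTuple):
    subregion: str
    intermediate_region: Optional[str]  # None where M49 assigns no intermediate region


# Illustrative excerpt only; not the complete M49 standard.
M49: Dict[str, M49Entry] = {
    "Kenya": M49Entry("Sub-Saharan Africa", "Eastern Africa"),
    "Nigeria": M49Entry("Sub-Saharan Africa", "Western Africa"),
    "Mexico": M49Entry("Latin America and the Caribbean", "Central America"),
    "India": M49Entry("Southern Asia", None),
    "Samoa": M49Entry("Polynesia", None),
}


def analysis_region(country: str) -> str:
    """Use the intermediate region label where assigned, and the subregion otherwise."""
    entry = M49[country]
    return entry.intermediate_region or entry.subregion


for name in ("Kenya", "India", "Mexico"):
    print(name, "->", analysis_region(name))
```

The sampling-frame counts reported above are also internally consistent: 11,086 + 1862 + 17 = 12,965, the number of high-quality genomes.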

Genotype and AMR prevalence estimates and statistical analysis
All statistical analyses were conducted in R v4.1.2 (R Development Core Team, 2021); code is available in R markdown format at https://github.com/typhoidgenomics/TyphoidGenomicsConsortiumWG1 (v1.0, doi:10.5281/zenodo.7487862). Genotype and AMR frequencies were calculated at the level of country and UN world region (based on 'country of origin') as defined above. Inclusion criteria for these estimates were: known 'country of origin', known year of isolation, non-targeted sampling, assumed acute illness (see definitions of these variables above). A total of 10,726 genomes met these criteria; the subset of 9478 isolated from 2010 onwards were the focus of the majority of analyses and visualisations, including all prevalence estimates. The prevalence estimates reported in text and figures are simple proportions; 95% confidence intervals (CIs) for proportions are given in text and supplementary tables where relevant. Annual prevalence rates were estimated for countries that had N≥50 representative genomes and ≥3 years with ≥10 representative genomes. Association between MDR prevalence and prevalence of IncHI1 plasmids amongst MDR genomes was assessed for countries with ≥5% MDR prevalence between 2000 and 2020. The significance of increases or decreases in prevalence was assessed using a Chi-squared test for trend in proportions (using the prop.trend.test function in R).
Table 1 (continued). Column headings: Published studies; PubMed ID or DOI (citation as per reference list); Total genomes; Representative cases 2010-2020 (*); Travel associated (†). *Genomes associated with assumed acute typhoid cases, isolated from 2010 onwards from non-targeted sampling frames; this is the subset of data used to generate genotype prevalence distributions shown in Figures 1-3. †Genomes recorded as travel-associated and with known travel to a specific country in this region, associated with assumed acute typhoid isolated from 2010 onwards from non-targeted sampling frames.
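For readers who want to see the calculations, the following Python sketch shows one way to compute a 95% CI for a simple proportion and a Chi-squared test for trend. It is not the authors' R code. The CI method is not stated in the text, so a normal-approximation (Wald) interval is assumed here; it reproduces the interval reported for genotype 2.3.2 in Central America (n=55/100; 45.2-64.8%). The trend statistic is the Cochran-Armitage form, which should correspond to what R's prop.trend.test computes with default scores. Function names are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_ci(successes: int, total: int) -> Tuple[float, float]:
    """Normal-approximation 95% CI for a proportion (assumed method)."""
    p = successes / total
    half_width = Z_95 * math.sqrt(p * (1 - p) / total)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def trend_test(
    events: Sequence[int],
    totals: Sequence[int],
    scores: Optional[Sequence[float]] = None,
) -> Tuple[float, float]:
    """Chi-squared test for linear trend in proportions (1 degree of freedom).

    Returns (chi-squared statistic, p-value). Scores default to 1, 2, 3, ...
    """
    if scores is None:
        scores = list(range(1, len(events) + 1))
    n_all = sum(totals)
    p_bar = sum(events) / n_all
    if p_bar in (0.0, 1.0):
        raise ValueError("no variation in outcome; trend test is undefined")
    s_bar = sum(n * s for n, s in zip(totals, scores)) / n_all
    t = sum(x * (s - s_bar) for x, s in zip(events, scores))
    variance = p_bar * (1 - p_bar) * sum(
        n * (s - s_bar) ** 2 for n, s in zip(totals, scores)
    )
    if variance == 0:
        raise ValueError("scores do not vary; trend test is undefined")
    chi2 = t * t / variance
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


low, high = wald_ci(55, 100)
print(f"{100 * low:.1f}-{100 * high:.1f}%")  # 45.2-64.8%

# Made-up yearly counts (resistant, total), for illustration only.
chi2, p_value = trend_test([1, 5, 9], [10, 10, 10])
print(round(chi2, 1), p_value < 0.001)
```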
There are no established thresholds for the prevalence of resistance that should trigger changes in empirical therapy recommendations for enteric fever; hence, we defined our own categories of resistance prevalence for visualisation purposes, to reflect escalating levels of concern for empirical antimicrobial use: (i) 0, no resistance detected; (ii) >0 and ≤2%, resistance present but rare; (iii) 2-10%, emerging resistance; (iv) 10-50%, resistance common; (v) >50%, established resistance. Robustness of prevalence estimates was assessed informally, by comparing overlap of 95% CIs computed for different laboratories from the same country (for genomes isolated 2010-2020, and laboratories with N≥20 genomes [Southern Asia] or N≥10 [Nigeria] meeting the inclusion criteria during this period).
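The five prevalence categories can be written as a small binning function. This is a sketch under one stated assumption: the text gives the ranges as 2-10% and 10-50% without saying which side of each boundary is inclusive, so the upper bound is treated as inclusive here, matching the explicit '>0 and ≤2%' and '>50%' definitions. The function name is hypothetical.

```python
def resistance_category(resistant: int, total: int) -> str:
    """Bin a resistance prevalence into the five visualisation categories.

    Assumption: upper bounds are inclusive, i.e. (0, 2], (2, 10], (10, 50], >50.
    """
    if total <= 0 or not 0 <= resistant <= total:
        raise ValueError("need total > 0 and 0 <= resistant <= total")
    if resistant == 0:
        return "no resistance detected"
    percent = 100 * resistant / total
    if percent <= 2:
        return "resistance present but rare"
    if percent <= 10:
        return "emerging resistance"
    if percent <= 50:
        return "resistance common"
    return "established resistance"


# XDR in Pakistan in 2020 is reported as 70% in the abstract.
print(resistance_category(70, 100))
```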

Overview of available data
A total of 13,000 confirmed Typhi genomes were collated from 65 studies and 5 unpublished public health laboratory datasets (see Table 1, Supplementary file 2). N=35 genomes had assembly sizes outside of the plausible range (4.5-5.5 Mbp, see Figure 1-figure supplement 1), leaving n=12,965 high-quality genomes originating from 110 countries. The distribution of samples by world region (as defined by the UN Statistics Division standard M49) is shown in Table 2, with country breakdown in Supplementary file 4. The majority originated from Southern Asia (n=8231), specifically India (n=2705), Bangladesh (n=2268), Pakistan (n=1810), and Nepal (n=1436). A total of n=1140 originated from South-eastern Asia, with >100 each from Cambodia (n=279), Vietnam (n=224), the Philippines (n=209), Indonesia (n=145), and Laos (n=139). Overall, 1106 genomes originated from Eastern Africa, including >100 each from Malawi (n=569), Kenya (n=254), and Zimbabwe (n=110). Other regions of Africa were less well represented, with n=384 from Western Africa, n=317 from Southern Africa, n=59 from Middle Africa (so-named in the M49 region definitions, although more commonly referred to as Central Africa), and n=41 from Northern Africa (see Table 2 and Supplementary file 4 for details).
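The assembly-size quality filter amounts to a range check. The sketch below is illustrative: the helper name is hypothetical, and treating the 4.5-5.5 Mbp bounds as inclusive is an assumption.

```python
MIN_ASSEMBLY_BP = 4_500_000  # 4.5 Mbp
MAX_ASSEMBLY_BP = 5_500_000  # 5.5 Mbp


def passes_size_qc(assembly_bp: int) -> bool:
    """True if the assembly length lies within the plausible range (bounds assumed inclusive)."""
    return MIN_ASSEMBLY_BP <= assembly_bp <= MAX_ASSEMBLY_BP


# Counts reported in the text: 13,000 collated, 35 outside the plausible range.
collated, failed_qc = 13_000, 35
print(collated - failed_qc)  # 12965
```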
In total, n=10,726 genomes were assumed to represent acute typhoid fever and recorded as derived from 'non-targeted' sampling frames, meaning local population-based surveillance studies or reference laboratory-based national surveillance programmes that could be considered representative of a given time (year of isolation) and geography (country and region of origin) (see Methods for definitions). The majority of these isolates (n=9478, 88.4%) originate from 2010 onwards; hence, we focus our reporting of genotype and AMR prevalences on this period. Most come from local typhoid surveillance studies (n=5574) or routine diagnostics/reference laboratory referrals capturing locally acquired (n=1543) or travel-associated (n=2284) cases. All prevalence estimates reported in this study derive from this data subset, unless otherwise stated.
Table 2 notes: Genomes recorded as travel-associated and with known travel to a specific country in this region, associated with assumed acute typhoid isolated from 2010 onwards from non-targeted sampling frames. Countries were assigned to world regions based on the United Nations (UN) Statistics Division standard M49.

Geographical distribution of genotypes
The breakdown of genotype prevalence by world region, for genomes isolated from non-targeted sampling frames from 2010 onwards, is shown in Figure 1a (denominators in Table 2, full data in Supplementary file 5). Annual breakdown of regional genotype prevalence rates is given in Figure 1-figure supplement 2 (raw data, proportions, and 95% CIs in Supplementary file 5).
Notably, while our data confirm that H58 genotypes (4.3.1 and derived) dominate in Asia, Eastern Africa, and Southern Africa, they were virtually absent from other parts of Africa, from South and Central America, as well as from Polynesia and Melanesia (Figure 1). Instead, each of these regions was dominated by its own local genotypes. Typhoid fever is no longer endemic in Northern America, Europe, or Australia/New Zealand. The genotype distributions shown for these regions were estimated from Typhi that were isolated locally but not recorded as being travel-associated; nevertheless, these genomes can be assumed to result from limited local transmission of travel-associated infections, and thus to reflect the diversity of travel destinations for individuals living in those regions. Annual national genotype prevalences for well-sampled countries with endemic typhoid are shown in Figure 1b.
In Western Africa, the common genotypes were 3. (Figure 1a). Most of these data come from the Typhoid Fever Surveillance in Africa Programme (TSAP) genomics report (Park et al., 2018) and a study of typhoid in Abuja and Kano in Nigeria, which showed that in the period 2010-2013, 3.1.1 dominated in Nigeria and nearby Ghana and Burkina Faso, whereas 2.3.2 dominated in The Gambia and neighbouring Senegal and Guinea Bissau (Park et al., 2018). Here, we find that additional data from travel cases and recent Nigerian national surveillance (Ikhimiukor et al., 2022a) [...]
Figure 1 caption (partial): [...] country are aggregated as 'other'. Full data on regional and national genotype prevalences, including raw counts, proportions, and 95% confidence intervals, are given in Supplementary files 5 and 6, respectively.
Very limited genome data were available from the Middle Africa region (n=19; Table 2). Genomes from Democratic Republic of the Congo (DRC) comprised 16 genotype 2.5.1 isolates (15 isolated locally, plus one from USA CDC) and a single 4.3.1.2.EA3 isolate (from the UK reference lab). Two genomes each were available from Angola (both 4.1.1, via UK) and Chad (both 2.1, via France). Northern Africa was similarly poorly represented, with one isolate from Egypt (0.1, via UK), two from Morocco (0.1, via UK and 1.1, via USA), two from Sudan (genotype 4, via UK), and one from Tunisia (3.3, from UK).

The Americas
Strikingly, Central American isolates were dominated by 2.3.2 (55% [95% CI, 45.2-64.8%]; n=55/100), which was also common in Western Africa (13.9% [95% CI, 9.7-18.0%]; n=37/266) (Figure 1a). Little has been reported about Typhi populations from this region previously, and the genomes collated here were almost exclusively novel ones contributed via the US CDC and isolated between 2016 and 2019. The available genomes for the period 2010-2020 mainly originated from El Salvador (n=19, 2012-2019, 89% 2.3.2), Guatemala (n=22, 2016-2019, 41% 2.3.2), and Mexico (n=58, 2011-2019, 50% 2.3.2). Prior to 2010, genotype 2.3.2 was also identified in isolates from Mexico referred to the French reference lab in 1972 (representing a large national outbreak; Baine et al., 1977) and 1998. The distance-based phylogeny for 2.3.2 included several discrete clades from different geographical regions in West Africa and the Americas (see Figure 1-figure supplement 4), consistent with occasional continental transfers between these regions followed by local clonal expansions. These comprised three clades dominated by West African isolates (one with isolates from West Coast countries, and two smaller clades from Nigeria and neighbouring countries); two clades of South American isolates (from Chile, Argentina, and Peru); one small clade of Caribbean (mainly Haiti) and USA isolates; and one large clade of Central American isolates (from Mexico, Guatemala, and El Salvador) (see Figure 1-figure supplement 4).
There were 105 genomes available from South America, of which 92% (n=97) were from a recent national surveillance study in Chile. South American Typhi were genetically diverse, with no dominant genotype accounting for the majority of cases in the 2010-2020 period (Figure 1a). Genotypes with ≥5% prevalence in the region were 3.5 (27%; n=28/105), 1.1 (18%; n=19/105), 2 (18%; n=19/105), 1.2.1 (5.7%; n=6/105), and 2.0.2 (5.7%; n=6/105).
WGS data recently reported by Colombia's Instituto Nacional de Salud (Guevara et al., 2021) were not included in the regional prevalence estimates as they covered only a subset (5%) of surveillance isolates that were selected to maximise diversity, rather than to be representative. However, only four genotypes were detected in the Colombia study (1.1, 2, 2.5, 3.5), and two-thirds of isolates sequenced were genotype 2.5 (67%; n=51/77); 3.5 was also common, at 25% (n=20/77) (Guevara et al., 2021). Similarly, all five isolates from French Guiana (sequenced via the French reference laboratory) were genotype 2.5, consistent with limited diversity and a preponderance of genotype 2.5 organisms in the north of the continent.
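The confidence intervals quoted for genotype prevalence can be illustrated with a short calculation. The sketch below assumes a normal-approximation (Wald) interval; the interval method actually used is described in the Methods and is not restated here, so this choice, along with the function name `wald_ci`, is an assumption made for illustration. Under that assumption the calculation gives 45.2-64.8% for n=55/100, matching the interval reported above for genotype 2.3.2 in Central America.

```python
import math

def wald_ci(successes, total, z=1.96):
    """Normal-approximation (Wald) CI for a proportion, clipped to [0, 1].

    Assumption: illustrative only; the paper's Methods define the
    interval actually used.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return max(0.0, p - half_width), min(1.0, p + half_width)

# Genotype 2.3.2 among Central American genomes, 2010-2020 (n=55/100)
low, high = wald_ci(55, 100)
print(f"{low:.1%} to {high:.1%}")
```

For small samples or proportions near 0 or 1, an exact or Wilson interval would generally be preferable to the Wald approximation.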

Pacific Islands
In Melanesia and Polynesia, each island has its own dominant genotype (Figure 1a): 2.1.7 and its derivatives in Papua New Guinea (n=5/5 in post-2010 genomes, consistent with the longer-term trend), 3.5.3 and 3.5.4 in Samoa (96%; n=249/259, consistent with a recent report), and 4.2 and its derivatives in Fiji (97%; n=31/32, consistent with recent data that were not yet available at the time of this analysis) (Davies et al., 2022).

Global distribution of AMR
We estimated the regional and national prevalence of clinically relevant AMR profiles in Typhi for the period 2010-2020, inferred from WGS data from non-targeted sampling frames for which country of origin could be determined (as per genotype prevalences, see Methods). In order to understand the potential implications of these AMR prevalences for local empirical therapy, we categorised them according to a traffic light-style system (see Methods), whereby amber colours signal emerging resistance of potential concern (<10%), and red colours signal prevalence rates of AMR that may warrant reconsideration of empirical antimicrobial use (>10%; see Figure 2 and Figure 2-figure supplement  1). The regional view (Figure 2-figure supplement 1, Supplementary file 7) highlights that CipNS is widespread, whereas CipR, AziR, and XDR have been mostly restricted to Southern Asia. MDR was most prevalent in African regions, and to a lesser degree in Asia. Full country-level data is mapped in Figure 2-figure supplement 2 and detailed in Supplementary file 8. National estimates for countries with sufficient data where typhoid is endemic (≥50 representative genomes available for the period 2010-2020, see Figure 2) (n=0), the Philippines (n=0), Samoa (n=0), Mexico (n=1, 1.7%), and Chile (n=0). The underlying genotypes are shown in Figure 2-figure supplement 3, and highlight that MDR in Asia, Eastern Africa, and Southern Africa has been mostly associated with H58 (i.e. 4.3.1 and derived genotypes) but in Western Africa is associated with the dominant genotype in that region, 3.1.1. In contrast, CipNS was associated with more diverse Typhi genotypes in each country, including essentially all common genotypes in Southern Asian countries (Figure 2-figure supplement 3). 
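The traffic light-style categorisation described above can be sketched as a simple threshold function. This is a minimal illustration only: the full category definitions are given in the Methods, and the handling of 0% ("none detected") and of exactly 10% (treated here as red) are assumptions, as is the function name `traffic_light`.

```python
def traffic_light(prevalence_pct):
    """Map an AMR prevalence (in percent) to an illustrative category.

    Assumptions: 0% is reported as 'none detected', and a prevalence
    of exactly 10% is placed in the red category.
    """
    if prevalence_pct <= 0:
        return "none detected"
    if prevalence_pct < 10:
        return "amber"  # emerging resistance of potential concern
    return "red"        # may warrant reconsidering empirical therapy

# Example values taken from the text: 1.7% (Mexico, n=1) and 0% (Chile, n=0)
for pct in (0.0, 1.7, 87.0):
    print(pct, traffic_light(pct))
```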
National annual prevalence data suggest that AMR profiles were mostly quite stable over the last decade (with the notable exception of the emergence and rapid spread of XDR Typhi in Pakistan) but reveal some interesting differences between settings in terms of AMR trends and the underlying genotypes (see Figure 3, Figure

Multidrug resistant
Prevalence of MDR (co-resistance to ampicillin, chloramphenicol, and co-trimoxazole) has declined in India (p=2 × 10⁻⁹ using proportion trend test) to 2% (0-3% per year, 2016-2020), and is similarly rare in Nepal (mean 5% in 2011-2019) (see Figure 3). MDR prevalence has also declined in Bangladesh (p=2 × 10⁻⁴ using proportion trend test) but remains high enough to discourage deployment of older first-line drugs, with prevalence exceeding 20% in most years (see Figure 3). In Pakistan, the emergence of the XDR strain 4.3.1.1.P1 has driven up MDR prevalence dramatically (p=4 × 10⁻¹¹ using proportion trend test), to 87% in 2020 (see Figure 3 and Figure 2-figure supplement 3b). MDR prevalence has remained high in Kenya and Malawi since the first arrival of MDR H58 strains (estimated early 1990s in Kenya [Kariuki et al., 2021]; 2009 in Malawi), but has declined steadily in Nigeria, from 72% in 2009 to 10% in 2017 (p=3 × 10⁻⁴ using proportion trend test; see Figure 3). All MDR isolates in Nigeria were genotype 3.1.1 and carried large IncHI1 MDR plasmids, which are associated with a fitness cost (Doyle et al., 2007). Chromosomal integration of the MDR transposon, which accounted for 100% of MDR in Malawi and 19% in Kenya (all in H58 genotype backgrounds), is associated with a comparatively lower fitness cost; this difference in fitness cost may explain why MDR has remained at high prevalence in some settings (where resistance is chromosomally integrated) while declining in other settings (where resistance is plasmid-borne). Figure 3-figure supplement 2 shows prevalence of MDR overlaid with prevalence of IncHI1 plasmid carriage amongst MDR strains.
Two countries showed a significant rise in MDR prevalence (Pakistan, p=4 × 10⁻¹¹; South Africa, p=9 × 10⁻⁸); in both countries, this rise coincided with loss of IncHI1 plasmids (see Figure 3-figure supplement 2) and assumed migration of MDR to the chromosome (as has been clearly shown in XDR 4.3.1.1.P1 strains in Pakistan) (Klemm et al., 2018). A decline in the prevalence of MDR over time was observed in Cambodia as in Nigeria, whereby all MDR strains belonged to the same genotype (4.3.1.1 in Cambodia, 3.1.1 in Nigeria) and carried the IncHI1 plasmid (see Figure 3-figure supplement 2). As noted above, MDR was maintained at high prevalence rates in Kenya and Malawi, where the IncHI1 plasmid frequency was either in decline (Kenya) or entirely absent (Malawi; see Figure 3-figure supplement 2). Notably, a significant decline in total MDR prevalence was observed in Bangladesh (p=2 × 10⁻⁴), and in MDR prevalence within the dominant genotype 4.3.1.1 (p=0.049), despite the majority of MDR (and all MDR within 4.3.1.1) being chromosomal rather than plasmid-associated (Rahman et al., 2020; da Silva et al., 2022). However, as noted above, MDR did persist in Bangladesh (exceeding 20% prevalence in most years). This is consistent with the hypothesis that the MDR plasmid is associated with a fitness cost that is removed when the MDR transposon becomes chromosomally integrated.
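The p-values in this section come from a proportion trend test on annual counts. As a rough illustration of what such a test computes, the sketch below implements the standard chi-squared test for linear trend in proportions (Cochran-Armitage form, 1 degree of freedom). The implementation used for the analysis is not restated here, so this should be read as an assumption about the general technique, and the counts in the example are synthetic, not study data.

```python
import math

def proportion_trend_test(resistant, totals, scores=None):
    """Chi-squared test for linear trend in proportions (1 d.f.).

    resistant: number of resistant isolates per year
    totals:    number of isolates per year
    scores:    ordinal scores per year (defaults to 1, 2, 3, ...)
    Note: undefined if the overall proportion is exactly 0 or 1.
    """
    if scores is None:
        scores = list(range(1, len(totals) + 1))
    n_total = sum(totals)
    p_overall = sum(resistant) / n_total
    mean_score = sum(n * s for n, s in zip(totals, scores)) / n_total
    trend = sum(x * (s - mean_score) for x, s in zip(resistant, scores))
    spread = sum(n * (s - mean_score) ** 2 for n, s in zip(totals, scores))
    chi2 = trend ** 2 / (p_overall * (1 - p_overall) * spread)
    # Upper tail of a chi-squared distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Synthetic example: prevalence falling from 75% to 25% over three years
chi2, p_value = proportion_trend_test([15, 10, 5], [20, 20, 20])
print(f"chi2={chi2:.2f}, p={p_value:.4f}")
```

The sign of the trend (rise or decline) is not captured by the chi-squared statistic itself and has to be read from the annual proportions.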

Extensively drug resistant
The XDR 4.3.1.1.P1 sublineage (i.e. MDR with additional resistance to fluoroquinolones and third-generation cephalosporins including ceftriaxone) was recognised as emerging in late 2016 in Sindh Province, where it caused an outbreak of XDR typhoid that has since spread throughout Pakistan (Klemm et al., 2018; Nair et al., 2021; Rasheed et al., 2020). Here, we identified the genome of strain Rwp1-PK1 (assembly accession NIFP01000000), isolated from Rawalpindi in July 2015, as genotype 4.3.1.1.P1. Rwp1-PK1 was isolated from a 17-year-old male with symptomatic typhoid whose infection did not resolve following ceftriaxone treatment and was found to be phenotypically XDR (resistant to ampicillin, co-trimoxazole, chloramphenicol, ciprofloxacin, ceftriaxone) (Munir et al., 2016). The isolate was later sequenced and reported as carrying blaCTX-M-15, blaTEM-1, qnrS1, and GyrA-S83F (Gul et al., 2017), but was not genotyped nor included in comparative genomics analyses investigating the emergence of XDR in Pakistan, so has not previously been recognised as belonging to the 4.3.1.1.P1 XDR sublineage. We found that the Rwp1-PK1 genome carries the 4.3.1.1.P1 marker SNV, clusters with the 4.3.1.1.P1 sublineage in a core-genome tree (Figure 4), and shares the full set of AMR determinants typical of 4.3.1.1.P1, indicating that this XDR strain was present in northern Pakistan for at least a full year before it was reported as causing outbreaks in the southern province of Sindh.

Strengths and limitations
This study presents the most comprehensive genomic snapshot of Typhi to date, with 12,965 high-quality genomes originating from 110 countries in 21 world regions. The consortium model provides improved consistency and completeness of source data aggregated from 77 laboratories and 66 unique studies. Our dataset also includes 1290 novel genomes sequenced by public health laboratories that would not otherwise have been published, including travel data from countries not previously represented in published Typhi genomics studies (e.g. El Salvador, Guatemala, Haiti, Mexico, and Peru). However, it is a post hoc analysis of isolates that were cultured in different contexts (including routine diagnostics, as well as study settings where culture would not normally be undertaken) and sequenced for different reasons (including retrospective studies, outbreak investigations, and routine surveillance). The study therefore has important limitations, most notably the scarcity of genomic data from many countries and world regions where typhoid is believed to be endemic (GBD 2019 Antimicrobial Resistance Collaborators, 2022), including Northern and Middle Africa, Western Asia, as well as Central and South America (Figures 1-3, Figure 3-figure supplement 1). These genomic data gaps reflect an underlying lack of routine blood culture or sustained blood-culture surveillance, and limited resources and expertise in many settings (Ikhimiukor et al., 2022b; Iskandar et al., 2021). In addition, public health authorities may be disincentivized to generate, analyse, and publish genomic data; we hope that this analysis strengthens the case for data generation and sharing for public good. Substantial investments have been made in recent years to improve and expand microbiological surveillance capacity in some low- and middle-income countries, but major regional surveillance gaps remain.
It is therefore important to maximise information recovery from available data sources, especially WGS, which provides data on the emergence and spread of AMR variants. While the inference of AMR phenotype from WGS is currently highly reproducible and accurate for Typhi (Argimón et al., 2021b; Chattaway et al., 2021), continued phenotypic antimicrobial susceptibility testing remains crucial to monitor for emerging mechanisms and to guide changes in empiric therapy.
For now, routine sequencing of travel-associated Typhi infections diagnosed in high-income countries helps to fill some molecular surveillance gaps for some regions, assuming that accurate travel history is available and the sequence and metadata (including country of origin) are shared (Ingle et al., 2019). For example, our study included >3000 genomes shared by public health reference laboratories in England, Australia, New Zealand, France, Japan, and the USA. These infections mostly originate in other countries, and can in principle provide informative, if informal, sentinel surveillance for pathogen populations in countries with strong travel and/or immigration links to those with routine sequencing (Ingle et al., 2019). Indeed, for some countries and regions, travel data represented most or all of the available genome data (see Table 2, Supplementary file 4). In this study, where multiple data sources were available for the same country, we found that national genotype and AMR prevalence estimates for the period 2010-2020 were largely concordant between local surveillance studies and travel-associated cases captured elsewhere (Figures 6-7, Figure 6-figure supplements 1-2), particularly when comparing contemporaneous annual prevalence estimates (Figures 6 and 7c). This shows that travel-associated Typhi isolated in low-burden countries can be informative for surveillance of some high-burden countries. It should serve as an incentive for public health reference laboratories to share their data to the fullest extent they are able to under local regulations. It should also encourage culture-based diagnostics in countries that rely primarily on clinical diagnosis of typhoid fever in local populations, and the development of molecular diagnostic tests for local use, as travel-associated infections may provide information on the predominant genes encoding resistance.
Another key limitation stemming from the post hoc nature of this study is that it is hard to assess how representative the prevalence estimates are for a given region/country and timeframe. The GTGC has developed new source/metadata standards for Typhi (see Methods), that include information on the purpose of sampling, which were completed by the original owners of each dataset (data available in Supplementary file 2). Such 'purpose-of-sampling' fields are currently lacking from metadata templates used for submission of bacterial genomes to the public sequencing archives (e.g. NCBI, ENA), and our approach was modelled on that established for sharing of SARS-CoV-2 sequence data, designed by the PHA4GE consortium (Griffiths et al., 2022). In this study, the purpose-of-sampling information was used to identify the subset of genome data that could be reasonably considered to be representative of national annual trends in genotype and AMR prevalence for public health surveillance purposes (n=9478 genomes post 2010; Figures 1-3). These originate mainly from local typhoid surveillance studies (59%), or routine diagnostics/surveillance capturing locally acquired (19%) and travel-associated (24%) infections. The comparisons of estimates for a given country based on different sources of genomes (Figures 6-7, Figure 6-figure supplements 1-2) are reassuring that the general scale and trends of AMR prevalence are reliable. The genome-based estimates are also in broad agreement with available phenotypic prevalence data on AMR in Typhi (Browne et al., 2020;Kariuki et al., 2015), although systematic aggregation of susceptibility data is limited. Both phenotypic and genomic analyses necessarily reflect blood-culture-confirmed cases, which may be biased towards more resistant infections resulting in overestimation of AMR prevalence. Notably, the genome data adds an additional layer of information on resistance mechanisms and the emergence and spread of lineages or variants. 
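The use of purpose-of-sampling metadata to select representative genomes before estimating prevalence can be sketched as follows. This is a minimal illustration under stated assumptions: the field names (`purpose_of_sampling`, `country`, `year`), the category labels in `REPRESENTATIVE_PURPOSES`, and the toy records are all hypothetical and do not reproduce the consortium's metadata template or analysis code, which are described in the Methods.

```python
from collections import defaultdict

# Hypothetical labels for sampling frames treated as representative
REPRESENTATIVE_PURPOSES = {"surveillance study", "routine diagnostics"}

def annual_prevalence(records, trait):
    """Proportion of genomes with `trait` per (country, year),
    restricted to records from representative sampling frames."""
    counts = defaultdict(lambda: [0, 0])  # key -> [positives, total]
    for rec in records:
        if rec["purpose_of_sampling"] not in REPRESENTATIVE_PURPOSES:
            continue  # e.g. outbreak investigations or targeted sequencing
        key = (rec["country"], rec["year"])
        counts[key][1] += 1
        if rec[trait]:
            counts[key][0] += 1
    return {key: pos / total for key, (pos, total) in counts.items()}

# Toy records (not study data)
records = [
    {"country": "A", "year": 2019, "purpose_of_sampling": "surveillance study", "MDR": True},
    {"country": "A", "year": 2019, "purpose_of_sampling": "routine diagnostics", "MDR": False},
    {"country": "A", "year": 2019, "purpose_of_sampling": "outbreak investigation", "MDR": True},
]
prevalence = annual_prevalence(records, "MDR")
print(prevalence)
```

In the toy example the outbreak-derived record is excluded, so the estimate is based on two genomes rather than three.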
Importantly, our study clearly shows that, whilst much attention has been given to the emergence and spread of drug-resistant H58 Typhi, other clones predominate outside of Southern Asia and Eastern Africa (Figure 1) and can be associated with CipNS (Figure 2-figure supplements 3-4), azithromycin (Figure 6), or ceftriaxone (Table 3), the drugs currently recommended by the World Health Organization as first choice treatment for enteric fever (World Health Organization, 2022).

AMR
Our data demonstrate that CipNS is emerging or established in all regions except Melanesia (here represented by n=35 genomes from Fiji and Papua New Guinea, mainly from 2010, although more recent reports support a lack of CipNS in Fiji [Davies et al., 2022; Getahun Strobel et al., 2019]; see Figure 2-figure supplement 1). For countries with sufficient data to assess (≥50 genomes), CipNS was emerging or established in all countries except Ghana (Figure 2, Figure 3-figure supplement 1), with no evidence of declining prevalence (Figure 3, Figure 3-figure supplement 1). A diverse range of genotypes and QRDR mutations are involved (Figure 2-figure supplements 3-4), likely reflecting the lack of fitness cost associated with these mutations (Baker et al., 2013). That QRDR mutations are so widespread is highly concerning, as infections with CipNS strains can take longer to resolve, and full clinical resistance can emerge relatively easily against this background, through acquisition of either a mobile qnr gene (as occurred in 4.3.1.1.P1 in Pakistan) or additional QRDR mutations (as occurred in 4.3.1.2.1 in India). Notably, the data suggest that CipR typhoid is now a well-established problem across Southern Asia and is emergent in Chile, Mexico, and South Africa (Figures 2 and 3, Figure 2-figure supplement 1, Figure 3-figure supplement 1). A recent study estimating national annual antibiotic consumption highlighted differences in rates of fluoroquinolone usage between regions and countries, which could potentially drive these differences in resistance prevalence (Browne et al., 2021). The highest rates of fluoroquinolone consumption were estimated in South Asian countries, rising from 1.67 defined daily doses (DDD) per 1000 per day in 2000 to 2.81 DDD/1000/day in 2010 and 2.94 DDD/1000/day in 2018 (see https://www.tropicalmedicine.ox.ac.uk/research/oxford/microbe/gram-project/antibiotic-usage-and-consumption). Fluoroquinolone consumption was also estimated to increase substantially in Latin America, rising from 0.64 DDD/1000/day in 2000 to 1.85 DDD/1000/day in 2010 and 2.26 DDD/1000/day in 2018. Our data show that the highest burden of CipR is associated with four main variants (Figure 2-figure supplement 5). In Pakistan, India, and Bangladesh, it is associated with locally emerged variants; however, the relatively high burden in Nepal is associated with variants acquired from India (Britto et al., 2018; Pham Thanh et al., 2016a). In other regions, CipR burden is low and so far linked mainly to the spread of 4.3.1.2.1 out of India (Britto et al., 2020; da Silva et al., 2022), plus occasional de novo emergence of resistant variants, which show no evidence of geographical spread (Figure 2-figure supplement 5). However, the high rates of CipNS in Kenya (53%) and Nigeria (40%) are concerning, especially given the increasing usage of fluoroquinolones in these countries (estimated 2.1 DDD/1000/day in 2018 in Kenya and 2.76 DDD/1000/day in Nigeria) (Browne et al., 2021), which could potentially drive local emergence and spread of CipR.

Figure 7 (continued). Abbreviations: MDR, multidrug resistant; XDR, extensively drug resistant; CipNS, ciprofloxacin non-susceptible; CipR, ciprofloxacin resistant; CefR, ceftriaxone resistant; AziR, azithromycin resistant. See Supplementary file 9 for three-letter laboratory code master list.

While resistance to azithromycin and ceftriaxone have been detected ( Table 3,  To our knowledge, there are no data reported on the fitness cost of acrB mutations or CefR plasmids in Typhi; however, the genomic evidence suggests a higher fitness cost compared with QRDR mutations, providing further support for the use of ceftriaxone or azithromycin over ciprofloxacin as we work to introduce preventative measures. Most instances of ESBL-gene carriage in Typhi (conferring CefR phenotype) have been short-lived (Table 3), suggesting selection against the acquisition of new ESBL genes or plasmids. The expansion and dominance of the XDR 4.3.1.1.P1 genotype in Pakistan is obviously concerning (Figures 4 and 6, Figure 2-figure supplement 3a, Figure 3-figure supplement 1); however, despite circulating at high prevalence in Pakistan for more than 5 years, the strain remains azithromycin-susceptible. There is also limited evidence of local transmission of 4.3.1.1.P1 in other countries; however, most countries near Pakistan have limited data available. A short local outbreak of XDR 4.3.1.1.P1 was reported in China, linked to contamination of an apartment block's water and non-travel-associated cases have been reported in the USA (Hughes et al., 2021). Notably, a CefR+CipR lineage of 4.3.1.2.1 that appears to be well established in Mumbai, India, has been isolated only occasionally since 2015 (Argimón et al., 2021b; Chattaway et al., 2021; Ingle et al., 2021; Jacob et al., 2021; Table 3); however, this is the only example of persistence of a CefR strain besides 4.3.1.1.P1, and there is no evidence it has yet spread outside Mumbai. We hypothesise that the lack of widespread dissemination of 4.3.1.1.P1 and ESBL-positive 4.3.1.2.1 so far may be due to the fitness cost imposed by the associated plasmids (~85 Kbp IncY plasmid in 4.3.1.1.P1 [Klemm et al., 2018]; ~43 Kbp IncX3 plasmid in 4.3.1.2.1 [Argimón et al., 2021b]).
The temporal trend data on MDR prevalence and IncHI1 plasmids (Figure 3-figure supplement 2) suggest that migration of the MDR locus from the plasmid to the chromosome may have mitigated the fitness cost associated with plasmid-borne MDR. The same may be true for ESBL genes, that is, the movement of the ESBL locus from the plasmid to the chromosome (as has recently been reported in 4.3.1.1.P1; Nair et al., 2021) may result in a fitter CefR or XDR variant that can spread more easily. Our data show that acrB mutations are occurring spontaneously and independently in multiple locations across a variety of genetic backgrounds (Figure 5). While they are still not prevalent, increased use of azithromycin through public health programmes (e.g. trachoma elimination) as well as widespread misuse of azithromycin to treat SARS-CoV-2 infections and use of azithromycin as first-line therapy for typhoid-like illness may lead to increased selection pressure. It will therefore be important to maintain and expand genomic surveillance, particularly in typhoid endemic countries where azithromycin is used widely. It is also notable that, while they are rare overall, acrB mutations have already arisen in two of the most common CipR lineages (4.3.1.2 and 4.3.1.3.Bdq); this relatively frequent co-occurrence warrants continued monitoring and investigation. While we did not detect the mobile AziR gene mphA, it is circulating in other S. enterica serovars (Tack et al., 2022) and other enteric bacteria that share plasmids with Typhi (including the human-specific Shigella; Baker et al., 2018), providing another potential mechanism for emergence of AziR in Typhi.

Applications of genomic surveillance for typhoid fever control
We are at a pivotal stage in the history of typhoid control. Wider access to clean water and improved sanitation have led to a major reduction in global incidence of typhoid fever, which has also been reflected in declining incidence of other enteric diseases (Steele et al., 2016). This should continue but will require sustained investment from national and local governments and thus remains a long-term objective. In the short to medium term, widespread use of TCVs can help to further reduce global incidence of typhoid fever. The WHO has prequalified two TCVs and recommended their use in endemic countries, as well as settings where a high prevalence of AMR Typhi has been reported (World Health Organization, 2018). Gavi, the Vaccine Alliance, has committed funds to support the procurement and distribution of TCVs in typhoid endemic countries (Gavi: The Vaccine Alliance, 2023a; Gavi: The Vaccine Alliance, 2023b). Five countries have undertaken Gavi-supported national introductions (Pakistan, Liberia, Zimbabwe, Nepal, Malawi) and one country has self-financed a national introduction (Samoa) (Neuzil, 2020; Sikorski, 2020). In Pakistan and Zimbabwe, TCV introduction was stimulated by the occurrence of AMR Typhi outbreaks in major urban centres, highlighting that the case for prevention can be stronger when curative therapy is less available. Additional support is likely required to inform TCV decision-making in other typhoid endemic countries, particularly where burden and AMR data are scarce.
With increasingly limited treatment options, vaccines are an even more important tool to mitigate the public health burden of AMR Typhi, both through the prevention of drug-resistant infections and through broader, indirect effects, like reduction of empiric antimicrobial use leading to reduced selection pressure. While TCVs have been shown to be highly effective against drug-resistant Typhi (Batool et al., 2021;Yousafzai et al., 2021), public health policymakers have to weigh the value of TCVs against other competing immunisation priorities. While TCV introduction is scaled up globally, antimicrobial stewardship should also be prioritised. Aggregated, representative data showing distribution and temporal trends in AMR can inform local treatment guidelines to extend the useful lifespan of antimicrobials licensed to treat typhoid fever, potentially including reverting to former last-line drugs in some settings. The traffic light system presented in this analysis (see Figure 2 and Figure 2-figure supplement 1) provides a framework for monitoring trends in AMR and adjusting empiric therapy guidelines accordingly. The WHO recently released its AWaRe (Access, Watch, Reserve) treatment guidelines (World Health Organization, 2022), which indicate that choice of empiric therapy should be guided by severity of presentation and local risk of fluoroquinolone resistance; if low risk, oral ciprofloxacin is recommended for both mild and severe cases and if there is a high risk of fluoroquinolone resistance, oral azithromycin is recommended for mild cases and intravenous ceftriaxone is recommended for severe cases. However, the guidelines do not indicate which prevalence rate of resistance should warrant avoidance of treatment with ciprofloxacin, nor do they indicate where high prevalence rates of resistance might be expected, although it is noted that drug resistance is most prevalent in Asia. 
There is an opportunity to further refine these recommendations with additional, local information about AMR prevalence and trends over time. Additional data are required from resource-limited settings, where typhoid fever diagnosis is often based on clinical presentation, to optimise these recommendations.
Genomic surveillance has a particularly important role to play in monitoring for changes in clinically important resistances in Typhi, as a shift in resistance mechanism or early evidence of clonal spread, which can only be identified definitively using WGS, could provide early warning of a likely increase in prevalence. This study provides an analytical framework for Typhi genomic analysis, based on an open, robust, reproducible data flow and analysis framework leveraging open-access online data analysis platforms (Typhi Mykrobe for read-based genotyping; the GHRU pipeline for genome assembly [Underwood, 2020], and Typhi Pathogenwatch for assembly-based genotyping and tree-building [Argimón et al., 2021b]). We have made available all data processing and statistical analysis code, and underlying sequence and metadata, via GitHub and FigShare (see Methods). Together, these provide (i) a comprehensive data and code resource for the research and public health communities interested in typhoid surveillance data; (ii) a model for the inclusion of WGS in project-based or routine surveillance studies of typhoid that can be readily replicated and adapted; and (iii) a sustainable model for aggregated analysis of typhoid genomic surveillance data that can readily incorporate new data and extract features (genotypes, AMR determinants, plasmid replicons) of importance to clinical and public health audiences.
Notably, this consortium-driven effort shows that new insights can be gained from aggregated analysis of published data, which were not evident from the individual contributing studies, for example (i) the XDR strain 4.3.1.1.P1 existed in Pakistan in 2015, a year earlier than previously reported (Figure 4); (ii) the CefR+CipR strain reported in Mumbai (Argimón et al., 2021b; Jacob et al., 2021) has persisted between at least 2015 and 2020 and is now more easily identified as 4.3.1.2.1 with blaSHV-12; (iii) persistence of MDR in certain settings is correlated with migration of MDR from plasmid to chromosome (Figure 3-figure supplement 2), which has implications for the future persistence and potentially spread of ESBL strains.
This dataset provides clear, actionable information about the distribution and temporal trends in AMR across multiple countries and regions. Where data gaps exist, the potential of travel-associated data to serve as 'sentinel' surveillance has been demonstrated previously by Ingle et al., 2019, and supported by additional data included in this analysis. These data can and should inform prioritisation of TCV introduction and improvements to WASH infrastructure. Sustaining and expanding genomic surveillance can also facilitate measuring the impact of TCV introduction on local bacterial populations, as has been done for previous vaccines like pneumococcal conjugate vaccines. In addition, monitoring for potential 'strain replacement' with other Salmonella serovars following TCV introduction can and should inform the prioritisation of the development and deployment of future combination Salmonella vaccines.
The SARS-CoV-2 pandemic illustrated the power of open, continuous data sharing and crowdsourced analysis, and the importance of ensuring that genomic surveillance leads to local benefits. The scale of this analysis, which was made possible through the efforts of an extensive network of collaborators, enables the extraction of key insights of public health relevance. The authors hope that this consortium effort serves as a starting point for continued data generation and sharing and collective analysis, with additional participation from an expanded group of stakeholders. In particular, we hope that researchers and public health authorities from areas with little publicly available data see the value of reporting and sharing genomic data for collective public health benefit. In addition, we hope that the current momentum for donor and government support of molecular surveillance is sustained, so that additional groups are able to generate their own data and fill regional data gaps to inform local public health action.

Supplementary files
• Supplementary file 1. Details of local ethical approvals provided for studies that were unpublished at the time of contributing data to this consortium project. Most data are now published, and the citations for the original studies are provided here. National surveillance programs in Chile, Colombia (Guevara et al., 2021), France, New Zealand, and Nigeria (Ikhimiukor et al., 2022b) were exempt from local ethical approvals as these countries allow sharing of non-identifiable pathogen sequence data for surveillance purposes. The US CDC Internal Review Board confirmed their approval was not required for use in this project (#NCEZID-ARLT-10/20/21-fa687).
• Supplementary file 2. Line list of 13,000 genomes included in the study.
• Supplementary file 3. Source information recorded for genomes included in the study. ^Indicates cases included in the definition of 'assumed acute illness'.
• Supplementary file 4. Summary of genomes by country.
• Supplementary file 9. Laboratory code master list. Three letter laboratory codes assigned by the consortium.

• MDAR checklist
Data availability
All data analysed during this study are publicly accessible. Raw Illumina sequence reads have been submitted to the European Nucleotide Archive (ENA), and individual sequence accession numbers are listed in Supplementary file 2. The full set of n=13,000 genome assemblies generated for this study is available for download from FigShare: https://doi.org/10.26180/21431883. All assemblies of suitable quality (n=12,849) are included as public data in the online platform Pathogenwatch (https://pathogen.watch). The data are organised into collections, which each comprise a neighbour-joining phylogeny annotated with metadata, genotype, AMR determinants, and a linked map. Each contributing study has its own collection, browsable at https://pathogen.watch/collections/all?organismId=90370. In addition, we have provided three large collections, each representing roughly a third of the total dataset presented in this study: Typhi 4.3.1.1 (https://pathogen.watch/collection/2b7mp173dd57-clade-4311), Typhi lineage 4 (excluding 4.3.1.1) (https://pathogen.watch/collection/wgn6bp1c8bh6-clade-4-excluding-4311), and Typhi lineages 0-3 (https://pathogen.watch/collection/9o4bpn0418n3-clades-0-1-2-and-3). In addition, users can browse the full set of Typhi genomes in Pathogenwatch and select subsets of interest (e.g. by country, genotype, and/or resistance) to generate a collection including neighbour-joining tree for interactive exploration.
The following dataset was generated: