Population structure and antimicrobial resistance patterns of Salmonella Typhi and Paratyphi A amid a phased municipal vaccination campaign in Navi Mumbai, India

ABSTRACT We performed whole-genome sequencing of 174 Salmonella Typhi and 54 Salmonella Paratyphi A isolates collected through prospective surveillance in the context of a phased typhoid conjugate vaccine introduction in Navi Mumbai, India. We investigate the temporal and geographical patterns of emergence and spread of antimicrobial resistance. We evaluated the relationship between the spatial distance between households and genetic clustering of isolates. Most isolates were non-susceptible to fluoroquinolones, with nearly 20% containing ≥3 quinolone resistance-determining region mutations. Two H58 isolates carried an IncX3 plasmid containing blaSHV-12, associated with ceftriaxone resistance, suggesting that the ceftriaxone-resistant isolates from India independently evolved on multiple occasions. Among S. Typhi, we identified two main clades circulating (2.2 and 4.3.1 [H58]); 2.2 isolates were closely related following a single introduction around 2007, whereas H58 isolates had been introduced multiple times to the city. Increasing geographic distance between isolates was strongly associated with genetic clustering (odds ratio [OR] = 0.72 per km; 95% credible interval [CrI]: 0.66–0.79). This effect was seen for distances up to 5 km (OR = 0.65 per km; 95% CrI: 0.59–0.73) but not seen for distances beyond 5 km (OR = 1.02 per km; 95% CrI: 0.83–1.26). There was a non-significant reduction in odds of clustering for pairs of isolates in vaccination communities compared with non-vaccination communities or mixed pairs compared with non-vaccination communities. Our findings indicate that S. Typhi was repeatedly introduced into Navi Mumbai and then spread locally, with strong evidence of spatial genetic clustering. In addition to vaccination, local interventions to improve water and sanitation will be critical to interrupt transmission. IMPORTANCE Enteric fever remains a major public health concern in many low- and middle-income countries, as antimicrobial resistance (AMR) continues to emerge. Geographical patterns of typhoidal Salmonella spread, critical to monitoring AMR and planning interventions, are poorly understood. We performed whole-genome sequencing of S. Typhi and S. Paratyphi A isolates collected in Navi Mumbai, India before and after a typhoid conjugate vaccine introduction. From timed phylogenies, we found two dominant circulating lineages of S. Typhi in Navi Mumbai-lineage 2.2, which expanded following a single introduction a decade prior, and 4.3.1 (H58), which had been introduced repeatedly from other parts of India, frequently containing “triple mutations” conferring high-level ciprofloxacin resistance. Using Bayesian hierarchical statistical models, we found that spatial distance between cases was strongly associated with genetic clustering at a fine scale (<5 km). Together, these findings suggest that antimicrobial-resistant S. Typhi frequently flows between cities and then spreads highly locally, which may inform surveillance and prevention strategies.


Setting and study population
Navi Mumbai is a city with a population of 1.12 million, including an estimated 129,500 children aged 0-6 years (19). In September of 2018, a vaccine campaign was conduc ted in a randomly selected 11 of 22 urban health post areas (designated phase 1), to children ages 9 months to 14 years. Coverage of the campaign was estimated to be 71% among age-eligible children living in these communities (18). The original plan was for the remaining communities to receive vaccination beginning 2 years later (phase 2); however, due to the COVID-19 pandemic, the second vaccine campaign has been delayed. To evaluate the effectiveness of the vaccine campaign, we performed prospec tive surveillance at six hospitals between June 2018 and March 2021. We recruited children with suspected enteric fever and performed blood culture for consenting participants (18). Additionally, culture-confirmed typhoid cases from a large private laboratory in the city were recruited into the study between June 2018 and March 2021. All culture-confirmed cases from the parent study were eligible for inclusion in this genomic epidemiology study.

Bacterial identification and antimicrobial susceptibility testing
S. Typhi and S. Paratyphi A were identified by biochemical profile and serotyping, and later confirmed by whole-genome sequencing (WGS). Antimicrobial susceptibil ity to ampicillin, co-trimoxazole, chloramphenicol, ciprofloxacin, and ceftriaxone was determined using the disk diffusion method (Oxoid, Thermo Scientific, MA, USA). All zone diameters were interpreted according to EUCAST v8.0 clinical breakpoints.

Whole-genome sequencing
Genomic DNA was extracted using Promega Wizard Genomic DNA Purification Kits (Promega Corporation, UK). WGS was performed at Genotypic Technology Pvt Ltd (Bangalore, India) using the Illumina Hiseq X Ten platform (Illumina, San Diego, CA, USA) to generate paired-end reads of 100-150 bp in length. Sequence data quality was checked using FastQC v0.11.9 to remove low-quality reads (20). We summarized all quality indicators using MultiQC v1.7 (21). Species identification was confirmed with Kraken2 (22), and the Salmonella in silico Typing Resource was used for WGS-based serotyping (23). Raw data assembly was achieved using SPADES assembler. Short Read Sequence Typing for Bacterial Pathogens (SRST2) (24) was used to map known alleles and identify MLSTs directly from reads according to the Salmonella enterica MLST scheme (https://pubmlst.org/salmonella/).

Mapping and SNP analysis
Paired-end Illumina reads were mapped to the S. Typhi CT18 (accession no. AL513382) reference chromosome sequence using RedDog mapping pipeline v1beta.11 (https:// github.com/ katholt/reddog). RedDog uses Bowtie2 v2.4.1 (25) to map reads to the reference genome and SAMtools v1. 10 (26) to identify single-nucleotide polymorphisms (SNPs) that have a phred quality score above 30 and to filter out those SNPs supported by less than five reads, or with 2.5× the average read depth that represents repeated sequences, or those that have ambiguous base calls. For each SNP that passes these criteria in any one isolate, consensus base calls for the SNP locus were extracted from all genomes mapped, with those having phred quality scores under 20 being treated as unknown alleles and represented with a gap character.
Chromosomal SNPs with confident homozygous calls (phred score above 20) in >95% of the genomes mapped (representing a "soft" core genome) were concatenated to form an alignment of alleles using the RedDog python script parseSNPtable.py with parameters -m cons, aln, and -c 0.95, and SNPs called in prophage regions and repetitive sequences (354 kb; ~7.4% of bases) in the CT18 reference chromosome, as defined previously (27), were excluded. SNPs occurring in recombinant regions were detected by Gubbins v2.4.1 (28) and excluded. The SNP data were used to assign all isolates to previously defined genotypes according to an extended S. Typhi genotyping framework using the GenoTyphi python script (https://github.com/katholt/genotyphi) (27).
To characterize and analyze the genomes of the 54 S. Paratyphi A strains isolated between June 2018 and March 2021, a similar bioinformatic process was adopted using S. Paratyphi A AKU_12601 [27] (accession no: FM200053) as the reference genome to create an alignment with another selected 108 Indian isolates from previous studies (29,30). Genotypes were assigned according to an S. Paratyphi A genotyping framework (31) using the Paratype script (https://github.com/CHRF-Genomics/Paratype).

Analysis of spatial distance and genetic clustering
We sought to evaluate whether spatial distance was associated with genetic clustering, hypothesizing that pairs of isolates from individuals whose homes are closer to one another are more likely to be genetically similar compared with pairs of individuals who live further from one another. One of the obstacles to testing this hypothesis is that individuals are represented across multiple pairs, such that the spatial and genetic relatedness of each pairwise comparison are not independent. To overcome this, we leveraged a recently developed hierarchical Bayesian modeling approach that accom modates this type of network dependence, along with spatial correlation, by incorporat ing spatially correlated individual-level random effect parameters into the regression framework (32). This approach has been implemented in an R package (GenePair). We evaluated an outcome of genetic clustering, defined as pairs of isolates that have fewer than six SNPs between them, following earlier literature (33), and in sensitivity analyses, we examined other thresholds (3 SNPs, 12 SNPs). We extended this model using splines to investigate whether the relationship between spatial distance and clustering varied as a function of distance. We fit conventional generalized additive models, using the mgcv package in R, to first visualize the relationship between spatial distance and clustering in models not accounting for network dependence and spatial correlation and then refit the GenePair models introducing splines at different points. Finally, to test whether isolates from individuals in vaccination clusters were more or less likely to genetic cluster compared with isolates from individuals in non-vaccination clusters, we fit models introducing dummy variables for pairs in which none, one, or both individuals were in a vaccination cluster. Each model was run where we collected 100,000 iterations from the algorithm; the first 50,000 were discarded prior to convergence of the model, and the remainder were thinned by a factor of 10 prior to summarizing the posterior distribution for each parameter in order to reduce posterior autocorrelation. We report the posterior median and quantile-based 95% credible intervals (CrIs) for each estimate. Models were checked visually and by using Geweke's Z score to ensure convergence.

Phylogenetic analyses
Maximum likelihood (ML) phylogenetic trees were inferred from the chromosomal SNP alignments using RAxML v8.2.10 (34) (command raxmlHPC-PTHREADS). A generalized time-reversible model and a gamma distribution were used to model site-specific rate variation (GTR+ Γ substitution model; GTRGAMMA in RAxML) with 100 bootstrap pseudo-replicates used to assess branch support for the ML phylogeny. We selected the single tree with the highest likelihood score as the best tree. The resulting phylogenies were visualized and annotated using the iTOL v5 online version (35).

Temporal and phylogeography analysis
To investigate dates of emergence and geographical transfers, we inferred timed phylogenies using temporally representative samples. To estimate evolutionary rates and times of common ancestry of isolates, we used treedater R package with an uncorrelated relaxed molecular clock and repeated the procedure 100 times (36). Finally, we reconstructed the ancestral state of nodes using the maximum parsimony approach with Phangorn R package (https://www.rdocumentation.org/packages/phan gorn/versions/2.8.1), considering events with a location probability of >0.5 between connected nodes. We considered a geographic transfer when the most probable location between two connected nodes (or between a node and a tip) differed, and we consid ered the time window of transfer as the date range between the nodes (or between the node and tip). The geospatial transmissions of lineage strains from the phylogeographic reconstructions were analyzed and visualized using ArcMap 10.7.1 (https://desktop.arc gis.com/en/arcmap/).

Resistome analysis
ARIBA (Antimicrobial Resistance Identifier by Assembly) v2. 10.0 and CARD data base v1.1.8 (https://card.mcmaster.ca/home) were used to investigate antimicrobial resistance (AMR) gene content. Point mutations in the QRDR of the DNA-gyrase gyrA/B and topoisomerase-IV parC/E genes, associated with reduced susceptibility to fluoroquinolones and quinolone resistance genes (qnrS), were also detected using ARIBA. Isolates were defined as being MDR if resistance genes were detected by ARIBA in the β-lactams, trimethoprim, sulfonamides, and chloramphenicol classes. Plasmid replicons were identified using ARIBA and the PlasmidFinder database (30).

The population structure of S. Typhi isolates in Navi Mumbai
A total of 174 S. Typhi isolates were available for genome sequencing. Among these, 33 (19%) were collected before the vaccine campaign began, and the remainder (141) were collected after. Genotype analysis showed that the pathogen population structure in Navi Mumbai is diverse, with 10 distinct genotypes identified (Fig. 1A). Genotype 4.3.1 (H58) was dominant, accounting for over half (97/174; 55.7%) of all isolates. The major sublineages of H58 (lineage I and lineage II) were present among our isolates, with lineage II (genotype 4.3.1.2) comprising the majority (75/97; 77%) of H58 isolates. The second most-prevalent genotype was 2.2 (33%; 58/174), and isolates from this clade were closely related. The two main clades circulating in Navi Mumbai (genotypes 2.2 and 4.3.1.2) were seen throughout Navi Mumbai, and we identified very little spatial aggregation of genotypes (Fig. 1B).

Spatial distance and genetic clustering
In hierarchical Bayesian models, we found that increasing geographic distance between isolates was strongly associated with genetic clustering (OR = 0.72 per km; 95% CrI: 0.66-0.79), using the six SNP threshold (Table 1). This effect was seen for distances up to 5 km (OR = 0.65 per km; 95% CrI: 0.59-0.73) but was not seen for distances beyond 5 km (OR = 1.02 per km; 95% CrI: 0.83-1.26). We observed no significant differences between the coefficients using shorter thresholds (e.g., 0-1 vs 0-5 km). The relationship between spatial distance and genetic clustering was robust to the SNP distance threshold: OR per km of 0.69 (95% CrI: 0.62-0.76) for SNP threshold of 3 and OR per km of 0.79 (95% CrI: 0.74-0.84) for SNP threshold of 12. There was a non-significant reduction in odds of clustering for pairs of isolates in vaccination communities compared with pairs in non-vaccination communities (OR = 0.42; 95% CrI: 0.11-1.43) or mixed pairs compared with pairs in non-vaccination communities (OR = 0.73; 95% CrI: 0.19-2.54).

Intra-country transmission within India
To provide context for the genomes from Navi Mumbai and better understand tempo ral and spatial distribution of lineages, we constructed a whole-genome phylogeny, including 1,357 additional S. Typhi genomes previously sequenced from 17 cities in India (Fig. S1). Within 4.3.1 lineage, there was no tight clustering observed among the Navi Mumbai isolates. Instead, our study isolates mainly clustered with previously sequenced isolates from neighboring Mumbai, indicating frequent transmission of 4.3.1 isolates between these cities. In comparison, all 2.2 isolates from Navi Mumbai were very closely related, and no clustering was observed with those from other cities.

Evolutionary history of S. Typhi isolates in India
We generated dated phylogenies to reconstruct the evolutionary history and geographic spread of the two main lineages (2.2 and 4.3.1) circulating in Navi Mumbai. Our analysis estimated the most recent common ancestor (tMRCA) of all S. Typhi H58 (4.3.1) isolates in India existed around 36 years ago (1986) and was first introduced in Navi Mumbai between 1989 and 2000 (Fig. S2). Phylogeographic reconstruction indicated that the most common origin of H58 isolates observed in Navi Mumbai were Vellore (n = 8) and Mumbai (n = 7). We also identified frequent transfers between Navi Mumbai and Mumbai (n = 9) (Fig. S3). We also predicted that ciprofloxacin-resistant triple mutant isolates were introduced in Navi Mumbai on at least 12 different occasions. Two ceftriaxone-resistant H58 (4.3.1.2) isolates from our collection were genetically identical to an earlier previously S. Typhi isolate carrying the bla SHV-12 gene described in Eastern India (Kolkata). Our analysis also showed that the ceftriaxone-resistant S. Typhi from Navi Mumbai were phylogenetically distant (8-45 SNPs) from previously documented ceftriaxone-resistant S. Typhi from Mumbai.
Phylogeographic reconstruction of the major non-H58 lineage circulating in Navi Mumbai (genotype 2.2) indicates that its ancestors were circulating in Vellore and were introduced in Navi Mumbai between 2007 and 2014 (Fig. 2). The distribution of isolates and tree topology is consistent with at least three different transfer events of isolates from other locations to Navi Mumbai (2008-2013), followed by ongoing local expansion of a fluoroquinolone non-susceptible clade carrying a single-point mutation (S83F) in QRDR region (gyrA).

Phylogenetic structure of S. Paratyphi A isolates in India
We identified seven different genotypes among the S. Paratyphi A isolates (Fig. 3). All genomes from Navi Mumbai have been assigned to secondary clades 2.3 and 2.4. Genotype 2.3.3 (37.1%; 20/54) was the most common, followed by 2.3 (24.1%; 13/54) and 2.4.2 (24.1%; 13/54). To place the Navi Mumbai isolates in context, we constructed a whole-genome-dated phylogeny including other S. Paratyphi A previously sequenced in India. We estimated that the tMRCA for all S. Paratyphi A in India existed around 1828-1832 (95% highest posterior density interval 1755-1921) and was first introduced to Navi Mumbai around 1992-1998. In addition, we identified at least 11 recent introductions of different genotypes to Navi Mumbai in the last 15 years.

DISCUSSION
Our data provide a historical insight into the composition of the circulating population structure of S. Typhi and S. Paratyphi A in India and contextualization of Navi Mumbai strains on a regional level. The close genetic relatedness of S. Typhi isolated in India, including the antimicrobial-resistant clades, indicates inter-regional transmission and suggests that enteric fever prevention strategies require a coordinated approach between these sites. We identified numerous transfers of H58 S. Typhi organisms from Mumbai to Navi Mumbai. Our observations are consistent with previous global phylo geography analysis that identified India as an important hub for the emergence and spread of antimicrobial-resistant S. Typhi clones (6). In contrast to the H58 analysis, phylogeography reconstruction of the major non-H58 lineage (genotype 2.2) in Navi Mumbai revealed that most typhoid cases resulted from a recent introduction accompa nied by local expansion, rather than long-term persistence. Previous studies have combined spatial-temporal analysis and genomic data to identify local patterns of typhoid transmission in highly endemic areas and provided evidence of intra-household transmission of genetically similar isolates (38)(39)(40)(41)(42). In our study, we found that spatial distance between households of typhoid cases was strongly predictive of the probability of genetic clustering, a finding which was robust to the genetic distance threshold used to define clustering. Up to 5 km, each kilometer was associated with 35% decreased odds of genetic clustering. Viewed another way, isolates from individuals living 1 km from another had a 5.6-fold increased odds of being in a genetic cluster compared with those from individuals living 5 km from one another, suggesting that much of typhoid transmission is local. The spatial scale of transmission likely varies between different communities, such that similar studies are needed elsewhere. Such findings can have important implications for informing decisions about the cluster size of randomized trials of vaccination or water, sanitation, and hygiene (WASH) interventions for control of typhoid. In particular, these results suggest that WASH interventions for typhoid prevention might be most effective if delivered closer to the household or point of water collection rather than on central municipal water and waste.
While typhoid conjugate vaccines have demonstrated a high degree of protection against clinical disease, there remain important questions about whether and to what extent they reduce transmission. The one cluster randomized trial of TCV did not demonstrate indirect effects, though it was not powered to do so (43,44). Genomic analyses may provide a means for assessing the impact of typhoid vaccines on transmis sion. A study from Thailand, which introduced an inactivated typhoid vaccine through a national program between 1977 and 1987, found that genotypes circulating after the vaccine introduction were different than those circulating before, mostly representing sporadic importations rather than local transmission (45). In the present study, we tested  Research Article mBio to see whether there was a reduction in genetic clustering of isolates in clusters receiving TCV and found a non-significant reduction (OR = 0.42, 95% CrI: 0.11-1.43). Given the proximity of clusters, risk of acquisition outside of household (and cluster), and unvacci nated population above 15 years of age, a larger sample size or longer period of observation may be needed to fully assess the transmission effects of vaccination on the S. Typhi population structure. Our findings showed that the S. Paratyphi A population from Navi Mumbai is diverse, and we observe close clustering with isolates from other regions. Although S. Paratyphi A lineages emerged in India around 191 years ago, our analysis identified multiple recent introductions of FQNS isolates in Navi Mumbai. The frequent transfer events and high level of fluoroquinolone resistance demonstrated by the different S. Paratyphi A genotypes are of great concern, especially due to the lack of an S. Paratyphi A vaccine, which limits prevention options. This highlights the importance of genomic surveillance to track the evolution of this pathogen and monitor its transmission.
The management of typhoid is challenging due to the emergence of antibiotic-resist ant S. Typhi strains and their changing resistance profiles (46). Cephalosporins and azithromycin are currently the first-line treatment for enteric fever in the majority of South Asian settings (15). Recent reports have established the emergence of third-gen eration cephalosporin-resistant S. Typhi in India (15,47,48). In our study, ceftriaxone resistance was linked to the acquisition of a IncX3 plasmid carrying the ESBL gene bla  In general, 4.3.1.2 isolates harboring bla SHV-12 were located in independ ent branches of the phylogenetic tree. These data are consistent with independent acquisitions of the resistant plasmid within genotype 4.3.1.2, suggesting that the ceftriaxone-resistant S. Typhi isolates from India have evolved independently from respective geographical locations.
The emergence and expansion of antimicrobial-resistant lineages in S. Typhi are mainly driven by antibiotic usage and selective pressure (49). This is supported by the emergence and regional dominance of H58 (4.3.1.2) QRDR triple mutant in India, which was associated with high fluoroquinolone exposure (16). Previous studies reported that H58 lineage II strains with triple QRDR mutations developed cephalosporin resistance by acquiring resistance plasmids such as IncX3 (bla SHV-12 ) (15). The emergence and dissemination of these high-risk clones in India could lead to large outbreaks and international spread as previously observed in XDR S. Typhi strains carrying the ESBL gene bla CTX-M-15 in Pakistan (14). Our findings of frequent spread of S. Typhi strains between cities and regions in India suggest that a highly antimicrobial-resistant clone arising in one region could swiftly disseminate, underscoring the need for vigorous surveillance and urgent responses to outbreaks of highly antimicrobial-resistant S. Typhi.
While our findings supplement our understanding of enteric fever in an endemic setting, our study is limited by the sample of isolates available for analysis, which was small and reflects sampling of local cases. In addition, the available genomes from India might not have broad representativeness across geographic location or time. These circumstances demonstrate the importance of local laboratory and genomic surveillance in endemic regions such as Navi Mumbai to monitor the ongoing evolution of antimicro bial resistance and the impact of control strategies such as vaccination programs in India.
Our findings show that the control of enteric fever in India and South Asia requires a coordinated strategy, given the inter-regional transmission of different genotypes, suggesting country-wide circulation. The emergence and transmission of high-risk lineages such as QRDR triple mutant and ceftriaxone-resistant organisms in settings with high burden of typhoid call for active surveillance. The implementation of Vi conjugate vaccines appears as an essential measure to control typhoid, but elimination will require immunization be accompanied by improvements in water sanitation and hygiene.

ACKNOWLEDGMENTS
This work was supported by Bill and Melinda Gates Foundation Grant #OPP1169264.

DATA AVAILABILITY
Details and accession numbers of sequence data included in our analysis are provided in the supplement (Tables S1 and S2).

ETHICS APPROVAL
The study was approved by the MGM New Bombay Hospital IRB, Vashi, India; Insti tutional Ethics Committee, Indian Council of Medical Research-National Institute of Cholera and Enteric Diseases (No. A-1/2020-IEC); WHO Research Ethics Review Commit tee (ERC.0002923); and Stanford University IRB (IRB-39627).

ADDITIONAL FILES
The following material is available online.