The Evolution and Transmission Dynamics of Multidrug-Resistant Tuberculosis in an Isolated High-Plateau Population of Tibet, China

Emerging isoniazid resistance in the 1970s allowed M. tuberculosis strains to spread and form into large multidrug-resistant tuberculosis clusters in the isolated plateau of Tibet, China. The epidemic was driven by the high risk of transmission as well as the potential of acquiring further drug resistance from isoniazid-resistant strains. ABSTRACT On the Tibetan Plateau, most tuberculosis is caused by indigenous Mycobacterium tuberculosis strains with a monophyletic structure and high-level drug resistance. This study investigated the emergence, evolution, and transmission dynamics of multidrug-resistant tuberculosis (MDR-TB) in Tibet. The whole-genome sequences of 576 clinical strains from Tibet were analyzed with the TB-profiler tool to identify drug-resistance mutations. The evolution of the drug resistance was then inferred based on maximum-likelihood phylogeny and dated trees that traced the serial acquisition of mutations conferring resistance to different drugs. Among the 576 clinical M. tuberculosis strains, 346 (60.1%) carried at least 1 resistance-conferring mutation and 231 (40.1%) were MDR-TB. Using a pairwise distance of 50 single nucleotide polymorphisms (SNPs), most strains (89.9%, 518/576) were phylogenetically separated into 50 long-term transmission clusters. Eleven large drug-resistant clusters contained 76.1% (176/231) of the local multidrug-resistant strains. A total of 85.2% of the isoniazid-resistant strains were highly transmitted with an average of 6.6 cases per cluster, of which most shared the mutation KatG Ser315Thr. A lower proportion (71.6%) of multidrug-resistant strains were transmitted, with an average cluster size of 2.9 cases. The isoniazid-resistant clusters appear to have undergone substantial bacterial population growth in the 1970s to 1990s and then subsequently accumulated multiple rifampicin-resistance mutations and caused the current local MDR-TB burden. These findings highlight the importance of detecting and curing isoniazid-resistant strains to prevent the emergence of endemic MDR-TB. IMPORTANCE Emerging isoniazid resistance in the 1970s allowed M. tuberculosis strains to spread and form into large multidrug-resistant tuberculosis clusters in the isolated plateau of Tibet, China. The epidemic was driven by the high risk of transmission as well as the potential of acquiring further drug resistance from isoniazid-resistant strains. Eleven large drug-resistant clusters consisted of the majority of local multidrug-resistant cases. Among the clusters, isoniazid resistance overwhelmingly evolved before all the other resistance types. A large bacterial population growth of isoniazid-resistant clusters occurred between 1970s and 1990s, which subsequently accumulated rifampicin-resistance-conferring mutations in parallel and accounted for the local multidrug-resistant tuberculosis burden. The results of our study indicate that it may be possible to restrict MDR-TB evolution and dissemination by prioritizing screening for isoniazid (INH)-resistant TB strains before they become MDR-TB and by adopting measures that can limit their transmission.

D rug resistance in Mycobacterium tuberculosis represents a major challenge for tuberculosis (TB) control programs. In 2020, there were about half a million new rifampicin (RIF)-resistant TB cases globally, of which 78% were also resistant to isoniazid (INH) and therefore to multidrug-resistant TB (MDR-TB) (1). In 2019, China had the second-largest burden of global MDR-TB (14%), exceeded only by India (27%). China has approximately 65,000 cases of MDR-TB annually, with high rates among both newly diagnosed (7.1%) and retreated (23%) patients (1). Tibet, also known as Xizang, is a provincial autonomous region of China that has greater burdens of both TB and MDR-TB than other areas of China. In 2014, the estimated prevalence of TB in Tibet was 758/100,000 population (2), and in the provincial capital, Lhasa, 21% of new and 57% of retreated cases had MDR-TB (3).
Tibet is relatively isolated and historically has had limited population exchange with other areas of China, and our recent study showed that local selection shaped the predominant clades of M. tuberculosis circulating on the Tibetan Plateau (4). Long-term transmission and outbreaks caused by drug-resistant TB strains have been reported in Russia (5), South Africa (6), and other countries but have not yet been described in China. While concern has generally concentrated on the increased risk of MDR-TB transmission, a recent study addressed the expansion potential of strains before they develop resistance to second-line drugs (7). To limit the emergence of MDR-TB outbreak strains, it may be useful to analyze the process through which M. tuberculosis evolves resistance to multiple drugs and to clarify the transmission risk during this process.
To trace the evolution of drug resistance and investigate the factors responsible for the high burden of MDR-TB in Tibet, we analyzed the whole-genome sequences of drug-resistant M. tuberculosis to identify the mutations conferring resistance to antituberculosis drugs and to reconstruct the evolution of drug resistance in Tibet based on the phylogenetic structure of the local clades. The results of our study indicate that it may be possible to restrict MDR-TB evolution and dissemination by prioritizing screening for INH-resistant TB strains before they become MDR-TB and by adopting measures that can limit their transmission.

RESULTS
Basic genetic characteristics and cluster selection. The whole-genome sequences of the 576 M. tuberculosis clinical isolates had an average depth of 79-fold (range, 22-to 137-fold). A total of 439,139 single nucleotide polymorphisms (SNPs) were detected at 16,156 sites, with an average of 762 SNPs in each strain, compared with the H37Rv reference genome. From the genotypic resistance profiles, 230 (39.9%) isolates were pansusceptible while the other 346 (60.1%) strains were resistant to at least one anti-TB drug, including 231 (40.1%) that were MDR-TB (Table 1; see Fig. S1 in the supplemental material). The drugresistant TB strains carried an average of 4.0 (95% confidence interval [CI], 3.8 to 4.2) resistance-conferring mutations. The ancillary information related to the strains was shown in Table S1 in the supplemental material.
To identify the transmission clusters of drug-resistant TB strains, we first calculated the pairwise genetic distances between strains. The Tibetan strains showed no clear separation but rather a gradual decrease in the distribution of the SNP differences between strains (see Fig. S2 in the supplemental material). Using 5 SNPs or 12 SNPs as thresholds, 48.8% (281/576) and 66.1% (381/576) of the strains, respectively, were genetically clustered, suggesting a high level of recent transmission. The majority (89.9%, 518/576) of the strains were separated from another strain by less than a 50-SNP distance and were confirmed as phylogenetically clustered. These genomes separated into 50 long time frame clusters that included 2 to 85 strains each.
To capture the process of drug-resistance evolution during the long-term transmission of M. tuberculosis, we selected the 11 largest clusters that contained at least 10 strains and at least 3 phylogenetically consecutive resistant strains for further analysis (Fig. 1). They included seven clusters of the ancient Beijing sublineage and four clusters of the modern Beijing sublineage. The clusters contained 372 strains, including 363 strains within the 50-SNP distance. The phylogenic structure with the corresponding drug resistance pro- files is shown in Fig. 2. These strains, which all belonged to lineage 2, accounted for 64.6% (372/576) of the total isolates and 76.2% (176/231) of the MDR-TB isolates ( Table 1).
A total of 25 different RIF-resistance mutations were identified in the clusters, of which 15 were found in at least 2 clusters, while the others were present in just one cluster (see Table S2 in the supplemental material). On average, the clusters carried 7.6 (range, 4 to 16) different RIF-resistance mutations. The number of different RpoB mutations was correlated with the number of RIF-resistant strains in each cluster, with an R 2 coefficient of 0.8883. The correlation was even stronger when we compared the number of different RpoB mutations with the number of drug-resistant (R 2 = 0.9120) or INH-resistant strains (R 2 = 0.9026) (see Fig. S3 in the supplemental material) in each cluster. Compensatory mutations in rpoABC genes were detected in 48 rifampicin-resistant strains belonging to 10 of the 11 clusters. Most compensatory mutations were present in only a single strain or in small subclusters of 2 to 3 strains, but the RpoB Gln409Arg mutation was found in a four-strain subcluster of cluster L23_12, and RpoC Asn416Ser and RpoA Gly31Ser were found in six-strain subclusters of clusters L20_07 and L20_13, respectively (Table S2).
Transmitted and acquired resistance. By analyzing the nodes in the phylogenetic tree where the resistance mutations were inferred to have occurred, we found that the evolution of resistance mutations could be divided into either transmitted or acquired resistance. Transmitted resistance is when the mutation occurred on the internal nodes such that adjacent strains carry the same mutation. In contrast, acquired resistance is when the mutation is present only in a single strain on a terminal branch tip. The proportions of transmitted resistance were highest among strains resistant to para-aminosalisylic acid (PAS) (94.5% [69/73]), followed by strains resistant to streptomycin (SM) (89.3% [158/177]) and INH (85.2% [178/209]). These resistant strains evolved from 7, 20, and 27 ancestral mutant clones, respectively, and the average transmitted cluster sizes were 9.9, 7.9, and 6.6 strains, respectively (Fig. 4). The proportions of transmitted resistance among strains with resistance to RIF, ethambutol, and pyrazinamide were 72.1% (178/247), 60.7% (74/122), and 60.0% (45/75), respectively, and the sizes of the transmitted resistant clusters were generally smaller than the transmitted clusters of INH-resistant strains (2.6 to 3.4 versus 6.6). Similarly, while the majority of MDR-TB strains (71.6% [126/176]) were also transmitted from ancestral clones, they were in relatively small clusters with an average size of 2.9.
Pathways of multidrug resistance evolution. We observed that mutations conferring resistance to PAS, SM, and INH occurred before resistance to other drugs, which likely explains their larger clusters and higher rates of transmitted resistance. To trace the evolution of resistance to pairs of drugs, we counted the events where resistance to drug A occurred before resistance to drug B and vice versa (Fig. 5). INH resistance overwhelmingly evolved before all other drug resistances. Among the 176 MDR-TB strains, 126 (71.6%) strains were first resistant to INH and then subsequently resistant to RIF (Fig. 5b). Only 8 (4.5%) MDR-TB strains acquired RIF resistance before INH resistance. The other 42 strains appear to have acquired resistance to both drugs at about the same time because it was impossible to determine the order in which resistance developed (Fig. 5c).
We used the method proposed by Muzondiwa et al. (8) to assess the risk of different orders of paired resistance (Fig. 5d). The correlations between resistance to paired anti-TB drugs were all strong with a high linkage disequilibrium (LD) over 0.70. Based on Levin's attributed risk, the risk of an INH-resistant mutant accumulating resistance to other drugs was greater than the opposite order.
Except for two strains that acquired ethambutol resistance before INH resistance and eight MDR strains that first developed RIF resistance, no other drug resistance occurred before INH resistance in the strains resistant to multiple drugs. Of the strains that were resistant to INH  To assess M. tuberculosis transmission by year in relation to resistance acquisition, we used the Bayesian skyline plots to reconstruct the expansion of the bacterial populations in the six biggest clusters (Fig. 6) and then mapped the estimated dates of where resistance to INH and RIF evolved onto the nodes. From these plots, it appears that the strains acquired INH resistance in the 1970s, which was followed by a period of exponential growth of their bacterial populations. By the late 1990s, however, when the clusters had developed additional resistance to RIF, the bacterial populations decreased in all except the largest cluster, Cluster L20_13, which had the longest time span with accumulated RIF resistance (Fig. 2b).

DISCUSSION
In this study of clinical M. tuberculosis isolates from Tibet, China, we characterized the evolutionary trajectories of drug resistance and demonstrated that INH resistance evolved in the 1970s, before other drug resistance, and was followed by a rapid expansion of the INH-resistant bacterial population. During the 1970s to 1990s, before RIF was widely used in Tibet, the INH-resistant strains were unlikely to be cured with the triple-drug therapy Because Tibet is relatively isolated, we could previously document the local adaptation of M. tuberculosis strains (4), and here, we could trace the evolution and transmission of drug-resistant tuberculosis over a fairly long time frame. We estimated that the proportion of transmitted resistance among MDR-TB was 71.6%, which is higher than the value calculated using pairwise SNP distances of 5 or 12 SNPs. These lower SNP thresholds are often used to define cases arising from direct or recent transmission (9, 10), but they will not capture the evolution and spread of drug-resistant M. tuberculosis strains over a longer time period. For example, a study from South Africa traced the longterm spread of an extensively drug-resistant TB (XDR-TB) strain into a highly monophyletic A recent study, similar to ours, traced the cross-continental expansion of the modern Beijing lineage W148 clade, whose members differed by a median of 32 SNPs. After nearly 30 years of initial INH resistance, RIF resistance emerged in the W148 clade around 1991, with at least 66 independent RIF resistance events, followed by the subsequent development of XDR and pre-XDR strains (7,11). The acquisition of INH resistance prior to RIF resistance has also been demonstrated in other successful MDR clades (12,13), most often with the KatG Ser315Thr mutation that confers highlevel INH resistance with minimal fitness costs (14). In vitro experiments have shown that M. tuberculosis spontaneously acquires INH resistance mutations more frequently than RIF resistance mutations (15).
Although it was thought that INH resistance could be cured routinely with the standard first-line therapy, a meta-analysis found high proportions of unfavorable outcomes among INH-resistant TB patients treated with the first-line regimen of INH, RIF, pyrazinamide, and ethambutol (HRZE), with frequencies of failure, relapse, and acquired multidrug resistance of 11%, 10%, and 8%, respectively (16). Worldwide, INH-resistant strains were found in 7.4% of new and 11.4% of retreated TB patients (17). Improved detection and effective treatment of these INH-resistant strains could help restrict the future evolution of MDR-TB.
This study had some limitations. We selected only the major clusters to illustrate the evolutionary pathways of drug resistance over a long time frame, and the resistance rates and the proportions of resistance transmission may be different if the analysis included the whole data set or all TB in Tibet. In addition, the passive case finding and limited diagnosis capacity in Tibet during the sampling period may have missed a substantial, but unknown proportion of incident TB. As a result, although we found high proportions of clustering and inferred the transmission of drug-resistant strains, it is likely that these data underestimate the true rates of clustering and MDR-TB transmission in Tibet.
In conclusion, the M. tuberculosis clusters emerged and expanded after acquiring INH resistance and became the precursors of the current MDR-TB burden in Tibet. Early detection of INH-resistant TB and full drug resistance profiles are essential to avoid further amplification and transmission of drug resistance.

MATERIALS AND METHODS
Study site and sampling. Tibet is located on the world's highest plateau, with an average altitude of over 4,000 meters above sea level. Its large expanse of 1.2 million square kilometers constitutes nearly an eighth of China's entire landmass but contains only 3.3 million inhabitants (,3 people per square kilometer). In a previous epidemiological study (18), M. tuberculosis isolates were collected from 576 patients presenting to 7 municipal-level clinics in Tibet during the years 2006, 2009, and 2010. Smear-positive patients were sequentially enrolled, and their treatment-naive sputum specimens were sent to the municipal central laboratory for culturing. The 576 isolates that were available and included in the study represent an estimated 40% of the smear-positive pulmonary TB patients diagnosed during the three sampling years. Only one strain per patient was included in our data set, except for second isolates from patients beginning a retreatment regimen. These 576 isolates were whole-genome sequenced, the genotyping showed that the majority belonged to lineage 2, and isolates were separated into several highly monophyletic clades (4). In this study, we selected the drug-resistant clades to investigate the emergence of resistance and the stepwise evolution into multidrug resistance.
Whole-genome sequencing and SNP calling. Genomic DNA of the M. tuberculosis isolates was extracted with the cetyltrimethylammonium bromide method (19). A 300-base-pair double-ended DNA library of each isolate was sequenced on the Illumina HiSeq 2500 platform with an expected depth of 100-fold. After the removal of low-quality reads with Trimmomatic (v0.39), the reads were aligned to the H37Rv reference genome (accession no. AL123456.3) with BWA-MEM (v0.6) and single nucleotide polymorphisms (SNPs) were called with the GATK tool (HaplotypeCaller v4.1.4.1). SNPs in the repetitive regions (e.g., PPE/PE-PGRS family genes) and in drug-resistant genes were removed with VCFtools. Confidential fixed mutations with a frequency of $75% and depth of $10 were kept and converted into a FASTA-format alignment for phylogenic reconstruction.
Genetic distance was calculated by comparing the sequences of each pair of strains to generate an SNP matrix. Genomic clusters were identified using different thresholds of pairwise SNP distances. Strains separated by 5 or 12 SNPs were considered "close clusters" that emerged over short-term windows of several years (9). Strains within a 50-SNP distance were considered "remote clusters" that formed over several decades that include the entire era of anti-TB chemotherapy beginning in the 1950s (20). The sequencing data were deposited in the Sequence Read Archive of the National Center for Biotechnology Information (accession no. PRJNA656167).
Phylogeny construction and phylodynamic analysis. The identified SNPs were used to construct a phylogeny tree with RAxML software (21), employing the maximum-likelihood method with a general time-reversible model of nucleotide substitution and main node values over 0.7 after 500 bootstrapping trees. The aligned sequences of the major drug-resistant clades were also analyzed with BEAST 2 software (v2.6.1) (22) to construct a dated phylogeny based on a molecular clock with an uncorrelated log-normal distribution and a tree generated prior with coalescent Bayesian skyline analysis. The molecular clock was set at 1.14 (0.49 to 1.80) Â 10 27 substitutions per site per year (0.50 [0.22 to 0.79] SNPs per genomeÁper year), which is a mutation rate reported elsewhere for lineage 2 strains (23). Bayesian skyline plots for the effective population size were reconstructed in Tracer software (24), with the age of the most recent common ancestor (MRCA) node being the height of the tree. The phylogenies were visualized with iTOL (v6; https://itol.embl.de/).
Determination of transmitted and acquired genotypic resistance. Known drug resistance mutations with a frequency of $5% in the sequence alignments were identified using the TB-profiler tool (v4.4.0) (25), based on a published database of mutations conferring resistance to 14 anti-TB drugs (26). Compensatory mutations were identified based on a previously reported database (27). Transmitted and acquired resistance were determined by mapping the resistance mutations onto a phylogenetic tree (5) using the following rules: mutations shared by all strains on a branch were considered to be present at the MRCA of the branch and termed "transmitted resistance," and these strains were considered to be phylogenetically clustered; and mutations on single branches and not present in neighboring branches were assumed to have occurred at the terminal tip and were considered "acquired resistance." Ethics approval and consent to participate. Ethical approval was obtained from the Ethical Review Board of the Chinese Center for Disease Control and Prevention (no. ICDC-2019010).
Data availability. Sequencing reads have been submitted to the NCBI or The European Bioinformatics Institute under study accession number PRJNA656167. In-house scripts developed by Phelan et al. (25) were used to automate the processing of raw sequencing reads, to generate an SNP distance matrix, and to identify genetic markers for drug resistance. The scripts have been deposited previously at GitHub (https://github.com/ pathogenseq/fastq2matrix; https://github.com/jodyphelan/TBProfiler).