Whole-genome sequencing of clinical isolates from tuberculosis patients in India: real-world data indicates a high proportion of pre-XDR cases

ABSTRACT Treatment decisions for tuberculosis (TB) in the absence of full drug-susceptibility data can result in amplifying resistance and may compromise treatment outcomes. Genomics of Mycobacterium tuberculosis (M.tb) from clinical samples enables detection of drug resistance to multiple drugs. We performed whole-genome sequencing (WGS) for 600 clinical samples from patients with tuberculosis to identify the drug-resistance profile and mutation spectrum. We documented the reasons reported by clinicians for referral. WGS identified a high proportion (51%) of pre-extensively drug-resistant (pre-XDR) cases followed by multidrug-resistant tuberculosis (MDR-TB) (15.5%). This correlates with the primary reason for referral, as non-response to the first-line treatment (67%) and treatment failure or rifampicin resistance (14%). Multivariate analysis indicated that all young age groups (P < 0.05), male gender (P < 0.05), and Beijing strain (P < 0.01) were significant independent predictors of MDR-TB or MDR-TB+ [pre-extensively drug-resistant tuberculosis (XDR-TB) and XDR-TB]. Ser315Thr (72.5%) in the inhA gene and Ser450Leu in the rpoB gene (65.5%) were the most prevalent mutations, as were resistance-conferring mutations to pyrazinamide (41%) and streptomycin (61.33%). Mutations outside the rifampicin resistance-determining region (RRDR), Ile491Phe and Val170Phe, were seen in 1.3% of cases; disputed mutations in rpoB (Asp435Tyr, His445Asn, His445Leu, and Leu430Pro) were seen in 6% of cases, and mutations to newer drugs such as bedaquiline and linezolid in 1.0% and 7.5% of cases, respectively. This study on clinical samples highlights that there is a high proportion of pre-XDR cases and emerging resistance to newer drugs; ongoing transmission of these strains can cause serious threat to public health; and whole-genome sequencing can effectively identify and support precision medicine for TB. IMPORTANCE The current study is based on real-world data on the TB drug-resistance profile by whole-genome sequencing of 600 clinical samples from patients with TB in India. This study indicates the clinicians’ reasons for sending samples for WGS, which is for difficult-to-treat cases and/or relapse and treatment failure. The study reports a significant proportion of cases with pre-XDR-TB strains that warrant policy makers’ attention. It reflects the current iterative nature of the diagnostic tests under programmatic conditions that leads to delays in appropriate diagnosis and empirical treatment. India had an estimated burden of 2.95 million TB cases in 2020 and 135,000 multidrug-resistant cases. However, WGS profiles of M.tb from India remains disproportionately poorly represented. This study adds a significant body of data on the mutation profiles seen in M.tb isolated from patients with TB in India, mutations outside the RRDR, disputed mutations, and resistance-conferring mutations to newer drugs such as bedaquiline and linezolid.


DNA extraction
DNA extraction was carried out from decontaminated specimens enriched in Bactec MGIT 960 culture tubes (BD Instrument Systems, Sparks, MD, USA) using Qiagen DNeasy UltraClean Microbial Kit (Qiagen, Germany).

WGS
The DNA libraries were prepared using Nextera system XT DNA Library Preparation kit (Illumina, San Diego, CA, USA) following the manufacturer's instructions and were sequenced on the Illumina NovaSeq platform (Illumina)

Bioinformatic analysis
Sequence alignment and mutation calling were done as reported earlier (11,12).The pipeline involves a quality control step based on deduplication, Phred score of >30, and metagenomic binning for Mycobacterium tuberculosis complex.The binned reads which passed the QC metrics were then mapped to H37Rv using both Burrows-Wheeler aligner-maximum exact matches (13) and Genome Analysis Toolkit (14).A final variant calling list in SAMtools variable calling format was generated (15).The variants were annotated using a catalog curated from global catalogs (11).Lineages were identified based on the single-nucleotide polymorphism (SNP) classification scheme as indicated by Homolka et al (16).Phylogenetic analysis was done using the method elaborated by Zade et al. (11).Briefly, SNPs from individual samples were concatenated and subjected to multiple sequence alignment using multiple alignment using Fast Fourier Trans form (https://www.ebi.ac.uk/Tools/msa/mafft/v7.310) (17).The alignment file was then used for phylogenetic reconstruction by the maximum-likelihood method in FastTree software (http://www.microbesonline.org/fasttree/v.2.1.11)(18).The phylogenetic trees were visualised using FigTree software (http://tree.bio.ed.ac.uk/software/figtree/v1.4.4) (19).

Statistical analysis
Quantitative data were entered in MS Excel and later processed using SPSS version 27.0.We used χ 2 test to compare proportions.Ages were represented as mean with standard deviation; Mann-Whitney U test was applied for comparison of ages between genders.Drug-resistance patterns are displayed graphically.We performed univariate and multivariate logistic regression analyses to identify risk factors associated with MDR-TB without variable selection unless otherwise stated in the results.For univaria ble and multivariable analyses, considering clinical significance, we grouped using the following categories: non-MDR-TB [drug susceptible (DS) and drug resistant (DR)] and for MDR-TB/MDR-TB+ [MDR-TB, pre-extensively drug-resistant tuberculosis (pre-XDR-TB) and XDR-TB].

Reasons for referral
Among the major reasons for patient referral for WGS, of 21 providers, 10 (private) mentioned multiple episodes of TB and non-response to the first-line anti-TB treatment, whereas the remaining 11 NTEP providers mentioned rifampicin resistance (RR) on CBNAAT/Xpert assay (3), family contacts of MDR-TB cases (3), non-response to first-line anti-TB treatment (3), and TB recurrence (2).

Characteristics of the patients
WGS was successfully performed for all isolates from 600 patients.Of the 600 patients, 309 (52%) were men; the median age of the patients was 29 years (interquartile range 21-45).Statistically significant differences (P < 0.0001) were seen in the mean ages between males (39.6 yrs) and females (29.7 years).Samples from 30% patients did not show acid-fast bacilli on microscopy, and this was seen particularly for samples from women than men (34% vs 26%, P = 0.033) (Table 1).WGS identified a high proportion of pre-XDR-TB (50.83%), followed by MDR-TB (15.5%), with nearly equivalent proportions among men and women.More women than men were found to have drug-susceptible TB (23% vs 16%, P = 0.02757) (Table 1).
We applied the classification system of Homolka et al. (16) for identification of lineages.The lineage informative SNP sets by this classification are largely compatible and unambiguously report the main lineages and sub-lineages when compared with other classification systems (Table S2).
The predominant lineage was the Beijing genotype (39.5%) followed by Delhi-Central Asian Strain (CAS) (36.66%) and East African Indian (EAI) (14.50%) (Table 4).Differences in lineage distribution between men and women were significant for Delhi-CAS (P = 0.024) and EAI (P = 0.032), but the distribution of all the lineages combined across the genders was not significant (P = 0.08).Significant differences were also seen in lineages across age groups (P = 0.009), with a higher proportion of Beijing lineage between 0 and 14 years (58%) and between 15 and 35 years (44.76%).Differences in drug resistance across lineages were also highly significant, with differences seen in Beijing lineage for DS, MDR, pre-extensively drug-resistant (pre-XDR) and XDR, and also in Delhi-CAS across the same categories.The pre-XDR-TB cases were mainly detected in isolates of the Beijing lineage (75.52%), followed by Delhi-CAS (40.45%).EAI lineage also exhibited significant differences with a lower proportion of pre-XDR and XDR isolates (Table 4; Fig. 4).
The univariate analysis indicated that younger age groups (<14, 15-35, and 36-56 years) and the Beijing strain were significant predictors of MDR-TB or MDR-TB+ (pre-XDR-TB and XDR-TB) (Table 5).However, multivariate analysis indicated that all young age groups, male gender, and Beijing genotype were significant independent predictors of MDR-TB or MDR-TB+ (Table 6).In other words, the multivariate analysis showed that children <14 years of age had nearly five times higher risk of developing or getting infected with MDR-TB/PreXDR or XDR-TB when compared with the age group of >57 years and young age groups (the age groups of 15-35 and 36-56 years have 2.5 times higher risk as compared to the age group of >57 years).Those infected with strains   that belong to the Beijing lineage had nearly 44 times higher risk of MDR-TB/pre-XDR or XDR-TB than strains belonging to any other lineage.The classification table indicated that overall, 74.5% of cases (94.6% of MDR-TB, pre-XDR, and XDR) were correctly predicted through this model.

Mutations
Maximum mutations were observed in regions known to encode resistance for RMP (16.72%),INH (20.48%),PZA (9.09%), SM (12.99%), and EMB (15.84%) (Fig. 5).Of the first-line drugs, rifampicin resistance was conferred predominantly by the Ser450Leu mutation (393 of 449, 87.5%), of which 74% (218 of 393) were in isolates of the Beijing lineage.Mutations outside the rifampicin resistance-determining region (RRDR) were seen in eight isolates (Ile491Phe and Ile170Val); disputed mutations in the rpoB were seen in 34 isolates (Table 7).The Ser315Thr katG mutation, prevalent at an average of 19% globally (20,21) and associated with moderate-to-high-level resistance in isoniazid, was seen in 72.02% (435 of 600) of the isolates reported here.We also report the presence of both the n.C-15T in the fabG1 promoter (22) and the Ser315Thr katG mutations in 17% (100 of 600) of the isolates.Concurrent Ser315Thr katG mutation and ndh Arg268His mutation were seen in 1.5% of the isolates (Arg268His ndh mutation was observed in a total of 15 isolates).In total, 154 isolates exhibited a varied compound mutation profile, suggesting a high proportion of combination mutations in genes known to encode resistance to INH.Resistance to pyrazinamide was predominantly conferred by the Gly132Ala and Leu27Pro mutations in the pncA gene (73 and 55 isolates, respectively), of which both mutations were detected in four isolates.A combination of Gly132Ala with other mutations in the pncA gene (Thr47Ala, Phe94Leu, Asp12Glu, Gln10Pro, and Rv2043c intergenic c. 2289252T>C) was seen in seven isolates.Mutations known to confer resistance to EMB were seen in ~65% (389 of 600) of the isolates, with the highest frequency being at the 306 position in the embB gene.The embB Met306Val mutation was found in 217 of the 600 isolates followed by embB Gln497Arg mutation in 101 of the 600 isolates.Resistance to SM was conferred by mutations in 61% (368 of 600) of the isolates, with the Lys43Arg in rpsl gene seen in 338 of the 368 isolates (91.8%), of which 218 isolates were of the Beijing lineage (64.49%) as compared to 120 isolates in all other lineages combined.We report a lower frequency of rrs mutations conferring SM resistance (6.52%, 24 of 383) compared to a previous report (23).Similarly, we observed a lower frequency of gidB gene mutations (3 of 383) as compared to a previous study (24).Twenty-four isolates had both Asp94Gly and Ala90 Val together, and a total of 65 isolates had more than 1 mutation, with 11 isolates having mutations in both the both gyrA and gyrase B (gyrB) genes.From the isolates which had gyrB mutations, four had Arg446Cys; three had Asn499Thr; and one isolate each having Thr500Asn, Ala504Thr, Asn499Asp, and Thr500Pro mutations.Since mutations in gyrB typically are seen co-occurring with gyrA mutations, it becomes difficult to delineate their exact contribution to resistance and have been associated with lower sensitivity and specificity to FQ resistance (Table 7) (25).
Mutations in the rrs and eis genes are associated with resistance to the SLIDs [amikacin (AMK), KAN, and capreomycin (CAP)], though other genes have also been implicated, such as whiB7, vapC21, and tllyA, their effects still require explanation with larger data sets (26).AMK resistance was conferred in ~11% (71 of 600) of the isolates (1401A>G in the rrs gene, associated with cross resistance to KAN and CAP), of which  T460C/Cys154Arg mutation in the rlc gene (27)(28)(29), was seen in 35 isolates with a higher frequency in the Beijing lineage (57.14%).We also report the mutation in the rrl gene known to be associated with a 8-to 50-fold increase in MIC (30) in 10 isolates in the current report.
SNP-based phylogenetic analysis suggests that, though the Beijing lineage is predominant in the study population, resistance in other lineages like EAI and Delhi CAS was also observed (Fig. 6; Fig. S1a through e).

DISCUSSION
Here we report real-world data on the anti-TB drug-resistance profiles identified by WGS performed on M.tb isolated from confirmed TB patients from various geographical regions of India.There is an accumulating evidence that restricted access to drug-resist ance diagnostics is fueling delays in appropriate care (31).In the absence of drug-sus ceptibility profile, patients are usually placed on therapy, which may contain a few drugs to which the M.tb strain is already resistant, and such therapy ultimately fails because of the risk of amplification of resistance (32).Such practices put communities at risk of exposure to increasingly resistant M.tb.Delays in diagnosis and an effective treatment of MDR-TB in the private sector are really a cause of concern (33).Despite a well-established national-level TB program that exists in India for more than three decades, the fact that the private sector plays a central role in TB management cannot be neglected.Interest ingly, in the present study, we found that these patients were referred by the private practitioners for the reasons of non-response to the first-line treatment and/or multiple episodes of TB.Similarly, the NTEP providers reported RR on CBNAAT or recurrence as the main reasons for referral.The alarmingly high proportion of pre-XDR in these referrals and the existing cure rates for MDR and XDR-TB in India, which were at a level of 57% and 55%, respectively, in 2019 (5), are a testament to the fact that we need to address these issues on high priority.Studies by Atre et al. (33) in Pune and Atre et al. (34) in Mumbai reported that 72% and 62% of MDR-TB cases, respectively, belonged to the young and productive age group (15-35 years), hinting toward an ongoing transmission of drug-resistant strains in this region.
The data presented here also indicate similar findings where young age is associated with MDR/PreXDR/XDR TB.Moreover, in the present study, it was observed that even  the middle-age group had a higher risk of being infected with highly resistant strains as compared to the age group of >57 years.These observations corroborate with the findings from a systematic review and meta-analysis (35) in Europe, which indicated that MDR-TB patients were younger than 65 years (odds ratio 2.53, 95% confidence interval (CI) 1.74 to 4.83).Furthermore, in the present study, we found that children <14 years had a higher risk of getting affected by the Beijing strain as compared to other strains.This observation was consistent with a study in Peru where Beijing strains were found Frequency of mutations in 600 cases across the 18 antibiotics.36).This study further shows male gender as a significant predictor of MDR/Pre-XDR/XDR-TB, which is similar to the observations in the meta-analysis ( 35), but, in contrast to previous studies in Maharashtra state of India, shows that female gender was a significant predictor of MDR-TB (33,34).Despite the high burden of TB that is prevalent in India, representation of the genomic data is scarce.Garnering understanding of mutations especially for newer drugs and their resistance profiles in different geographic settings is critical, as is the need to constantly update databases.Here we report the presence of resistance-conferring mutations to RIF outside the RRDR region of the rpoB gene, which are not covered by rapid diagnostic tests (37) and are missed even by phenotypic DST (38,39).This is a worrying trend and may lead to inaccurate diagnosis and underestimation of rifampicin resistance in the population (40).The mutation profile conferring resistance to INH showed high proportion of concurrent mutations and mutation conferring resistance to high dose of isoniazid (41,42), further undermining the use of isoniazid in these cases.Double mutations of inhA and katG reported here are also associated with increased risk at baseline for CAP resistance, XDR, and acquired FQ resistance as compared to mutations in katG only (43).We also detected multiple mutations conferring resistance to ethambutol were seen in 23 (Met306Val and Gln497Arg mutations in the embB gene) and 36 (Met306 in varied combinations) isolates, with a total of 72 isolates presenting >1 mutation.
Another concerning observation was the detection of mutations in the pncA gene in several isolates.Mutations in the pncA gene are shown to contribute to almost 72%-98% of resistance to PZA, along with known discordance between genotypic and phenotypic DST (44)(45)(46), due to the unreliable phenotypic DST of PZA (20,47).The inclusion of PZA in several clinical trial regimens (Standardised Treatment Regimen of Anti-tuberculosis Drugs for Patients with Multidrug-resistant Tuberculosis [STREAM] and simplifying treatment for drug sensitive and-resistant tuberculosis [SIMPLICI TB]) and its combinatorial action with other drugs highlight the necessity of effectively determining resistance to it, especially since there is tripling of the mortality risk when individuals with resistance to PZA are put on a regimen with this component (48).
One of the most common drug resistances encountered in M.tb isolates worldwide is toward SM (49)(50)(51).SM is still used in several countries including India to complete the longer treatment regimen for MDR-TB and as a substitute for AMK.It is interesting to note that mutants Lys43Arg and Lys88Arg in rpsl exhibit high-level resistance, whereas mutants in rrs have low effect on MICs (52).Double mutations in gyrA and gyrB conferring resistance to FQ was observed in 11, which is in line with the 1%-3% range in previous reports (53)(54)(55).
FQ resistance is a major determinant of treatment failure in MDR patients.The referral patients for WGS in this study have shown a very high proportion of pre-XDR with FQ resistance causing mutations seen in 61% isolates.Global reports indicate roughly 60%-90% mutations being found in the quinolone determining resistance region codons 88-94 (53) of the gyrase A (gyrA) gene.In our study, 63% showed Asp94Gly mutation, and 25% had Ala90Val mutations.The presence of Asp94 mutations is sufficiently indicative of resistance; however, other mutations may result in MICs that may not be detected by the existing culture-based tests (56).We report a much higher FQ resistance (60%) as compared to the 21% FQ resistance reported in the national drug-resistance survey (57).Since our samples were predominantly from the individuals with suspected MDR cases, there could be a sample bias.A recent study in Mumbai showed a high proportion of FQ resistance among multidrug-resistant tuberculosis driven by dominant Lineage 2 M.tb clones (58), which includes the Beijing sub-lineage.Our study also reflects a high proportion of Beijing lineage and FQ resistance.A high level of FQ resistance is ubiquitous and is thus a cause of concern for India.
Another issue which is of concern is the presence of heteroresistance, which can cause treatment failures.Heteroresistance affects the resistance detection ability of rapid molecular tests, underestimating the true burden of FQ resistance.Here we reported that 60% (219 of 364) of the isolates with resistance-conferring mutations to FQs exhibited heteroresistance, with NGS reads ranging from 20% to 98% of the mutant alleles.In a previous study on five patients who were infected with ofloxacin-resistant TB and were treatment naïve with FQs, each carried mixed populations of resistant and sensitive bacilli (56).We also reported mutations in the rrs gene conferring resistance to KAN and CAP.Though these drugs are no longer used in treatment regimens, mutations reported in the rrs gene mutations are known to be significantly associated with XDR-TB (59).
Finally, a crucial finding in this report is the presence of mutations conferring resistance to BDQ and LZD.Both drugs are part of the shorter BPaL regimen for treating MDR-TB patients.The atpE Ala63Pro mutation reported here has been documented to cause a four-to eightfold increase in BDQ MICs with the DR-TB isolates (60,61), which when present with mutations in the Rv0678 gene may confer high-level BDQ resistance with an increased MIC (MIC of 2.0 and >8.0 mg/L in MGIT) (61).
The phylogenetic reconstruction using whole-genome SNPs of isolates reported here suggests that the accumulation of resistant mutations is an independent event.We could not establish an evolutionary transitional relationship between DS, DR, MDR, pre-XDR, and XDR strains in the study sample data set.Phylogenetic reconstruction from the DR, MDR, pre-XDR, and XDR samples in our data set showed lineage-based clustering, suggesting unbiased propensity of accumulating resistance imparting mutations among, Beijing, Delhi-CAS, EAI, and Clade 1.We speculate that the increasing incidences of MDR, pre-XDR, and XDR strains in different lineages could be due to the initiation of an antibiotic treatment regime based on an inconclusive diagnosis.
Implementation of WGS on clinical isolates of TB can have a positive impact on treatment changes and patient outcomes.Cox et al. (62), in a study of TB patients in South Africa, indicated that 46% (385 of 834) may have benefited from reduced drug dosage or removing ineffective drugs; a further 22% (187 of 834) may have benefited from effective adjusted regimens; and 35% (153) of the 440 patients eligible for longer regimen could be prescribed an effective shorter regimen.The impact on the outcome on accurate therapy guided by TB-WGS was also observed by Zürcher et al. (63), wherein from seven TB high-burden countries it was reported that underdiagnosis of drug resistance resulted in inappropriate treatment and higher mortality with an adjusted odds ratio of 4.92 (95% CI 2.47-9.78)among undertreated patients, compared with appropriately treated patients.
In that context, it will be important to study what strategies have been adapted by the clinicians after receiving the WGS results and how those impacted further treatment outcomes.Also studying the utility and cost-effectiveness of implementing WGS in an Indian setting would be a key step.In our opinion, the NTEP should take cognizance of these findings and initiate new studies to generate evidence for scale-up of WGS in near future.

Conclusion
In conclusion, there are relatively few studies providing information of the mutation spectrum seen in M.tb from India, a country which contributes to almost one-third of TB cases.Though this study has limitations with respect to sampling bias, availability of phenotypic data, and clinical outcomes, to the best of our knowledge, this is one of the largest report to date which adds a significant body of information on the drug-resistant mutations from clinical referral samples in India and brings to attention the issue of emerging resistance to newer drugs such as bedaquiline.It underscores the need for training both care providers for rational use of anti-TB drugs and the patients.
42 were associated with the Beijing lineage.Mutations in the eis gene were seen in 20 isolates, with 11 having the 2715346G>A (−14G>A) and 9 with the 2715342C>T (−10C>T) mutations.The eis gene mutations in the promoter regions (−10C>T and −14G>A) seen in 20 isolates (3.3%) are known to result in a low-level minimum inhibitory concentration (MIC) increase which can overlap with the WHO-based critical concentra tions and often result in discordance with phenotypic assays.However, low resistance to BDQ was conferred by the Ala63Pro mutation in the atpE gene in three isolates and by mutations in the Rv0678 (known to confer cross resistance to CFZ) in seven isolates.LZD resistance, predominantly encoded by the

FIG 3
FIG 3 Distribution of lineages across age groups.

FIG 4
FIG4 Frequency of drug-resistance categories across lineages in 600 cases.
children than are non-Beijing strains (

FIG 6
FIG 6 Phylogenetic analysis of 600 samples using concatenated SNP sequences.The lineages Delhi-CAS, EAI, Beijing, and Clade 1 are pink, purple, blue, and black, respectively.The phylogenetic reconstruction suggests that the clustering does not support the lineage-based clades, possibly due to the lineage-inde pendent resistance overlapping SNPs other than phylogenetic markers.

TABLE 1
Characteristics of study participants CharacteristicsTotal, n (%) Women, n (%) Men, n (%) P value a Results analyzed with data available for 598 of 600 participants.bResults analyzed with data available for 551 of 600 samples.c P < 0.05; χ 2 tests, significant difference between women and men.d P < 0.01; χ 2 test, highly significant difference between women and men.

TABLE 4
Frequency of lineages across gender, age, and drug-resistance status c

TABLE 7
Spectrum of common mutations lineage wise, disputed mutations, mutations outside RRDR