Lineage classification and antitubercular drug resistance surveillance of Mycobacterium tuberculosis by whole-genome sequencing in Southern India

ABSTRACT Whole-genome sequencing has created a revolution in tuberculosis management by providing a comprehensive picture of the various genetic polymorphisms with unprecedented accuracy. Studies mapping genomic heterogeneity in clinical isolates of Mycobacterium tuberculosis using a whole-genome sequencing approach from high tuberculosis burden countries are underrepresented. We report whole-genome sequencing results of 242 culture-confirmed clinical isolates of M. tuberculosis from tuberculosis patients referred to a tertiary care hospital in Southern India. Phylogenetic analysis revealed that the isolates in our study belonged to five different lineages, with Indo-Oceanic (lineage 1, n = 122) and East-African Indian (lineage 3, n = 80) being the most prevalent. We report several mutations in genes conferring resistance to first- and second-line antitubercular drugs, including the genes rpoB, katG, ahpC, inhA, fabG1, embB, pncA, rpsL, rrs, and gyrA. The majority of these mutations were identified in relatively high proportions in lineage 1. Our study highlights the utility of whole-genome sequencing as a potential supplemental tool to the existing genotypic and phenotypic methods, in providing expedited comprehensive surveillance of mutations that may be associated with antitubercular drug resistance as well as lineage characterization of M. tuberculosis isolates. Larger-scale whole-genome datasets with linked minimum inhibitory concentration testing are imperative for resolving the discrepancies between whole-genome sequencing and phenotypic drug sensitivity testing results and for quantifying the level of resistance associated with the mutations, to optimize antitubercular drug and precise dose selection in clinics.
IMPORTANCE Studies mapping genetic heterogeneity of clinical isolates of M. tuberculosis for determining their strain lineage and drug resistance by whole-genome sequencing are limited in high tuberculosis burden settings. We carried out whole-genome sequencing of 242 M. tuberculosis isolates from drug-sensitive and drug-resistant tuberculosis patients, identified and collected as part of the TB Portals Program, to gain a comprehensive insight into the genetic diversity of M. tuberculosis in Southern India. We report several genetic variations in M. tuberculosis that may confer resistance to antitubercular drugs. Further wide-scale efforts are required to fully characterize M. tuberculosis genetic diversity at a population level in high tuberculosis burden settings for providing precise tuberculosis treatment.

Tuberculosis (TB) and antitubercular drug resistance are significant problems in India (1). Over recent decades, molecular drug susceptibility testing (DST) has become a cornerstone of TB management by enabling rapid diagnosis of TB infection as well as detection of antitubercular drug susceptibility with commendable specificity and sensitivity (2,3). The classical molecular diagnostic tests, such as the real-time PCR-based GeneXpert and line probe assays such as Genotype MTBDRplus and MTBDRsl, can interrogate only a few confined parts of the Mycobacterium tuberculosis (M. tuberculosis) genome and do not give comprehensive information regarding antitubercular drug susceptibility (4). However, whole-genome sequencing (WGS) has emerged as a cost-effective diagnostic tool for better prediction of drug susceptibility phenotype to antitubercular drugs compared to existing phenotypic and genotypic DSTs. WGS provides a comprehensive picture of genetic polymorphisms, such as single-nucleotide polymorphisms (SNPs) and small insertions and deletions (indels), within a short turnaround time. WGS also enables the identification of new drug resistance mutations as well as the detection of mutations associated with low-level resistance with unprecedented accuracy (5-11). The WGS-guided genotypic predictions of the susceptibility of M. tuberculosis to first-line drugs were found to be correlated with phenotypic susceptibility to these drugs (12,13). WGS data have also aided in delineating information regarding transmission dynamics as well as intra- and interpatient variations during outbreaks of TB infection (14-17).
The complete genome sequence of M. tuberculosis was described in 1998 (18). The utility of WGS in TB diagnostics is poised to rapidly shift from a research-only perspective to routine patient care, population surveillance, and public health intervention strategies for implementing pathogen-based precision medicine treatments for TB (4,13,19-21). Studies on the mapping of genetic heterogeneity of clinical isolates of M. tuberculosis for determining their strain lineage and drug resistance by WGS are limited in India (22-25). To gain a comprehensive insight into the genetic diversity of M. tuberculosis in Southern India, we carried out WGS of M. tuberculosis isolates from drug-sensitive and drug-resistant (DR) TB patients, identified and collected as part of the TB Portals Program. The TB Portals database, an open web-based platform, is a repository of linked socioeconomic/geographic, clinical, laboratory, radiological, and genomic data from prospective and retrospective TB cases (26).
MTB/RIF (Cepheid, Sunnyvale, CA, USA). Smear- or GeneXpert-positive M. tuberculosis samples were cultured by concentrating and decontaminating the sample with 2% NaOH (sodium hydroxide)-NALC (N-acetyl L-cysteine solution), neutralizing with phosphate buffer, and inoculating into a 7 mL MGIT liquid culture tube (Becton, Dickinson and Company, MD, USA) supplemented with oleic acid albumin dextrose catalase and the antibiotic cocktail PANTA (lyophilized polymyxin B, amphotericin B, nalidixic acid, trimethoprim, and azlocillin) (Becton, Dickinson and Company, MD, USA). Culture tubes were incubated in the BD BACTEC MGIT 960 system (Becton, Dickinson and Company, MD, USA) until the instrument flagged them positive. After 42 days, the instrument flagged the culture tubes negative if there was no growth. For tubes with positive cultures, smears were prepared from the broth and stained with Ziehl-Neelsen staining. The smears were observed under a light microscope; the absence of acid-fast bacilli (AFB) indicated contamination, which was confirmed by subculturing on blood agar followed by one-day incubation. Contaminated cultures with a negative AFB smear were discarded. Contaminated MGIT cultures with a positive AFB smear were re-cultured to obtain pure cultures. Bacterial cultures that were positive for AFB were further assessed with a TB Ag MPT64 rapid test (SD Bioline, Standard Diagnostics, Inc., South Korea), which uses an immunochromatographic method to discriminate between M. tuberculosis and Mycobacterium other than tuberculosis (MOTT).

DST
Contamination-free, confirmed M. tuberculosis complex isolates obtained from 3- to 5-day-old MGIT cultures were tested for qualitative DST by using the BD BACTEC MGIT 960 system, which gives reports within 14 days of incubation. The SIRE kit (Becton, Dickinson and Company, MD, USA), which contains lyophilized drug vials and SIRE supplement (Becton, Dickinson and Company, MD, USA), and the PZA kit (Becton, Dickinson and Company, MD, USA), comprising lyophilized drug vials and PZA supplement (OADC enrichment) (Becton, Dickinson and Company, MD, USA), were used for the DST. The drug concentrations used were streptomycin 1.00 µg/mL, isoniazid 0.10 µg/mL, rifampicin 1.00 µg/mL, ethambutol 5.00 µg/mL, and pyrazinamide 100.0 µg/mL. Results were interpreted automatically and reported as susceptible or resistant by the system. In the case of multidrug-resistant TB (MDR TB), second-line DSTs were performed following the same methodology as with first-line DSTs. Second-line drugs and their concentrations were kanamycin 2.50 µg/mL, amikacin 1.00 µg/mL, capreomycin 2.50 µg/mL, ofloxacin 2.00 µg/mL, and moxifloxacin 2.00 µg/mL (27). (All the drugs were sourced from Becton, Dickinson and Company, MD, USA.) All these drugs were reconstituted with sterile distilled water to obtain a working solution. A 100 µL volume of each working solution of these antitubercular drugs was added to 8.3 mL of the MGIT medium (7.0 mL of medium + 0.8 mL of SIRE supplement + 0.5 mL of inoculum), which gave a 1:84 dilution of the working solution. Phenotypic DSTs were not performed for other second-line antitubercular drugs.
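The 1:84 dilution stated above follows directly from the volumes given, and can be checked with a short sketch. The function names are hypothetical, and the working-solution concentrations it derives are back-calculated from the stated final concentrations; they are not values reported by the authors or the kit manufacturer. Pyrazinamide is left out because the text describes the 8.3 mL composition only for the SIRE supplement.

```python
# Volumes stated in the text, in mL.
DRUG_VOLUME_ML = 0.1     # 100 µL of drug working solution
MEDIUM_VOLUME_ML = 8.3   # 7.0 medium + 0.8 SIRE supplement + 0.5 inoculum


def dilution_factor(drug_ml=DRUG_VOLUME_ML, medium_ml=MEDIUM_VOLUME_ML):
    """Total volume divided by the volume of working solution added."""
    return round((drug_ml + medium_ml) / drug_ml, 6)


def working_concentration(final_ug_per_ml):
    """Back-calculate the working-solution concentration (assumption:
    the final concentration is the working concentration / dilution factor)."""
    return round(final_ug_per_ml * dilution_factor(), 6)


# Final (critical) concentrations from the text, in µg/mL.
FINAL = {
    "streptomycin": 1.0,
    "isoniazid": 0.1,
    "rifampicin": 1.0,
    "ethambutol": 5.0,
}
WORKING = {drug: working_concentration(c) for drug, c in FINAL.items()}
```

Under these assumptions, a final isoniazid concentration of 0.10 µg/mL corresponds to a working solution of 8.4 µg/mL.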

DNA extraction
DNA was extracted for WGS from M. tuberculosis cultures using the QIAamp DNA mini kit (Qiagen, Hilden, Germany) following the manufacturer's protocol. The concentration and purity of extracted DNA were measured using a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA), and DNA integrity was checked by 1% agarose gel electrophoresis. All laboratory work involving M. tuberculosis cultures was performed in a mycobacteriology lab under negative pressure conditions.

Library preparation and sequencing
Barcoded paired-end WGS libraries were prepared using the NEBNext Ultra DNA Library preparation kit following the manufacturer's instructions. The resulting libraries were quantified before loading on the cBot for cluster generation and sequencing on an Illumina HiSeq system to generate 2 × 150 base pair (bp) sequence reads.

Bioinformatic analysis
Sequencing data were analyzed after the necessary quality control steps. A minimum of 75% of the sequenced bases were required to have a quality score of 30 (Q30) or greater. Sequenced data were processed to generate FASTQ files. Sequencing reads were checked to ensure adequate coverage of the TB reference genome. All samples included in the analysis had a sequencing depth of 10 reads or greater over at least 97% of the H37Rv genome. Genotypic drug susceptibility prediction was performed with TB-Profiler (version 4.1.1) using the Pilon variant caller option (28). To determine whole-genome variants, sequenced reads were first quality trimmed using Trimmomatic (version 0.36), where the 3-prime end of a read is trimmed once the average quality of bases in a 4-base window drops below 15. Trimmed reads were aligned to the M. tuberculosis H37Rv reference genome (accession NC_000962.3) using the bwa-mem algorithm as part of the BWA software package (version 0.7.17-r1188), and variants were called using Pilon (version 1.23) (29).
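The two inclusion thresholds described above (at least 75% of bases at Q30 or greater, and a depth of 10 reads or greater over at least 97% of the reference) can be expressed as a simple filter. This is a minimal sketch with hypothetical function names and toy inputs; it is not the authors' pipeline, and it assumes base qualities and per-position depths have already been extracted as plain lists of integers.

```python
MIN_Q30_FRACTION = 0.75  # at least 75% of bases at Q30 or greater
MIN_DEPTH = 10           # reads per reference position
MIN_BREADTH = 0.97       # fraction of the reference at MIN_DEPTH or more


def fraction_q30(base_qualities):
    """Fraction of sequenced bases with Phred quality >= 30."""
    if not base_qualities:
        return 0.0
    return sum(q >= 30 for q in base_qualities) / len(base_qualities)


def breadth_at_depth(per_position_depth, min_depth=MIN_DEPTH):
    """Fraction of reference positions covered by at least min_depth reads."""
    if not per_position_depth:
        return 0.0
    covered = sum(d >= min_depth for d in per_position_depth)
    return covered / len(per_position_depth)


def passes_qc(base_qualities, per_position_depth):
    """True when a sample meets both thresholds stated in the text."""
    return (
        fraction_q30(base_qualities) >= MIN_Q30_FRACTION
        and breadth_at_depth(per_position_depth) >= MIN_BREADTH
    )


# Toy data (hypothetical): 80% of bases at Q35, 98% of positions at depth 12.
toy_quals = [35] * 80 + [20] * 20
toy_depth = [12] * 98 + [3] * 2
toy_result = passes_qc(toy_quals, toy_depth)
```

In practice the per-position depths would come from the alignment (for H37Rv, roughly 4.4 million positions), so a streaming count would be preferable to holding a list in memory.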

Phylogenetic clustering
The vcf files for the individual samples were merged into a multi-sample vcf file. This file was converted into the nexus format and used as input for maximum likelihood phylogenetic analysis with the IQ-TREE v1.6.12 software. Sample lineage information was manually added to the tree figure using Inkscape v1.0.1. Heat maps of the DST class, district, and the DR SNPs identified by TB-Profiler (28) were created using the R package ggtree2. The maximum likelihood phylogenetic tree with the associated metadata heat maps is presented in Fig. 1.
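The conversion from merged variant calls to a nexus alignment can be illustrated with a small sketch. The text does not name the conversion tool, so the functions below are hypothetical stand-ins; they handle SNPs only and assume that a sample without a call at a variable position carries the reference base, which is a simplification.

```python
def snp_alignment(sample_calls, reference_bases):
    """Build one SNP string per sample.

    sample_calls: {sample: {position: alternate_base}}
    reference_bases: {position: reference_base} for every variable position
    Assumption: a missing call means the sample matches the reference.
    """
    positions = sorted(reference_bases)
    return {
        sample: "".join(calls.get(p, reference_bases[p]) for p in positions)
        for sample, calls in sample_calls.items()
    }


def to_nexus(alignment):
    """Serialize {sample: sequence} as a minimal NEXUS DATA block."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    nchar = lengths.pop()
    lines = [
        "#NEXUS",
        "BEGIN DATA;",
        f"  DIMENSIONS NTAX={len(alignment)} NCHAR={nchar};",
        "  FORMAT DATATYPE=DNA MISSING=? GAP=-;",
        "  MATRIX",
    ]
    lines += [f"  {name} {seq}" for name, seq in alignment.items()]
    lines += ["  ;", "END;"]
    return "\n".join(lines) + "\n"


# Toy example with hypothetical sample names and positions.
toy_reference = {10: "C", 20: "A", 30: "T"}
toy_calls = {"sampleA": {10: "T"}, "sampleB": {20: "G", 30: "C"}}
toy_alignment = snp_alignment(toy_calls, toy_reference)
toy_nexus = to_nexus(toy_alignment)
```

A SNP-only alignment like this omits invariant sites, which affects branch-length estimation; whether and how the authors corrected for this is not stated in the text.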

RESULTS
The general characteristics of the study population are described in Table S1. The maximum likelihood phylogenetic tree with the associated metadata heat maps is presented in Fig. 1. The TB population in our study was genetically diverse, as the lineage analysis revealed that the isolates in our study belonged to four different lineages, with lineage 1 (n = 122) and lineage 3 (n = 80) being the most common. Other lineages included lineage 2 (n = 9) and lineage 4 (n = 22). We also identified nine cases of mixed infection, as shown in Fig. 2: mixed-lineage isolates of lineage 1 and 3 (n = 5), lineage 1 and 4 (n = 2), and lineage 3 and 4 (n = 2). One isolate was a mixture of two lineage 4 sublineages, as shown in Table 1.
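As a consistency check, the lineage counts reported above can be tallied against the 242 sequenced isolates. The sketch below assumes the single-lineage and mixed-lineage counts are mutually exclusive (so the isolate with two lineage 4 sublineages is counted within lineage 4); that reading is an assumption, although it reproduces the stated total.

```python
# Counts as reported in the text.
LINEAGE_COUNTS = {
    "lineage 1": 122,
    "lineage 2": 9,
    "lineage 3": 80,
    "lineage 4": 22,
    "mixed 1+3": 5,
    "mixed 1+4": 2,
    "mixed 3+4": 2,
}

TOTAL_ISOLATES = sum(LINEAGE_COUNTS.values())


def share(label):
    """Percentage of all isolates assigned to the given group."""
    return round(100.0 * LINEAGE_COUNTS[label] / TOTAL_ISOLATES, 1)


def mixed_total():
    """Number of isolates carrying more than one lineage."""
    return sum(n for label, n in LINEAGE_COUNTS.items()
               if label.startswith("mixed"))
```

With these counts, lineage 1 and lineage 3 together account for a little over 83% of the isolates.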
The frequency and percentage of single mutations and combinations of mutations that may confer resistance to a specific antitubercular drug among different isolates are presented in Table 2, along with their distribution across different lineages. The mixed lineage 3,4 did not have any mutations and hence was not included in the table.
All the mutations in rifampicin-resistant strains were confined to the rpoB gene, with six (40%) of them carrying the rpoB_p.Ser450Leu mutation. We observed significant lineage-specific variations in the proportion of isolates with rifampicin resistance-conferring mutations, as shown in Table 2. Of the rifampicin resistance mutations identified by WGS, 73.3% (n = 11) were from lineage 1, and 20% (n = 3) and 6.7% (n = 1) were from lineage 2 and 4, respectively. No rifampicin resistance mutations were identified by WGS among lineage 3 isolates. There was a concordance of 83.3% for the high-frequency rpoB_p.Ser450Leu mutation and 100% for the rpoB_p.His445Asp and rpoB_p.Ser450Phe mutations with the phenotypic test, as shown in Fig. 3. Isolates with combinations of low-frequency mutations, such as rpoB_p.Gly442Glu, rpoB_p.Leu464Met, and rpoB_p.Ser441Ala (n = 1), and rpoB_p.His445Arg with rpoB_p.Ser441Leu (n = 1), were not phenotypically resistant. Among the four isolates with rpoB_p.His445Leu mutations, phenotypic discordancy was observed in three isolates. All four of these isolates belonged to lineage 1, with two isolates each in lineage 1.1.2 and lineage 1.2.2.2. Three of these isolates were identified to be sensitive and one resistant by GeneXpert. Phenotypic discordancy was also observed in an isolate with the rpoB_p.Leu430Pro mutation. Among the seven isolates that had rifampicin resistance mutations but were phenotypically sensitive, six samples were found to be rifampicin sensitive by GeneXpert (one sample did not have a GeneXpert value available). Among isolates with both GeneXpert and WGS results (n = 203), a 95% concordance was observed for the identification of rifampicin resistance status between GeneXpert and WGS. Among the 15 isolates identified to have rifampicin DR mutations, coexisting isoniazid (n = 14), pyrazinamide (n = 8), ethambutol (n = 12), streptomycin (n = 8), fluoroquinolone (n = 3), aminoglycoside (n = 1), and ethionamide (n = 4) DR mutations were observed. Among the eight samples that had DR mutations to all four first-line drugs (isoniazid, rifampicin, ethambutol, and pyrazinamide), phenotypic resistance to rifampicin was observed in seven isolates. Among these eight isolates, an isolate with the rpoB_p.His445Leu mutation was found to be phenotypically susceptible to rifampicin. A coexisting katG_p.Ser315Thr mutation was present in four of the six isolates with the rpoB_p.Ser450Leu mutation. Of the four samples with the rpoB_p.His445Leu mutation, two had a coexisting ahpC_c.-52C>T promoter mutation, and the other two had the combination of the promoter mutation inhA_c.-154G>A and inhA_p.Ile194Thr. An average of 5.2 (range: 2-10) DR mutations were identified among these 15 WGS-identified rifampicin-resistant isolates.
katG_p.Ser315Thr was identified as the most frequent mutation among the isoniazid-resistant isolates. Lineage-specific variations in the proportion of isolates with isoniazid resistance-conferring mutations were also identified, with ~70% (n = 23) of isolates being reported as lineage 1. As seen in the DR SNP heat map in Fig. 1, almost all the DR mutations were scattered among very distantly related samples. An exception to this was a group of 11 isolates with the ahpC_p.Glu76Lys mutation. These samples formed a monophyletic clade of DR samples, as shown in Fig. S1. All of the 11 TB patients whose isolates carried the ahpC_p.Glu76Lys mutation were pulmonary TB (PTB) patients, and the isolates belonged to the same lineage 1 and sublineage 1.2.2.2. Ten of these patients were from the same Udupi district and one was from the Davanagere district. Ten of these cases were newly diagnosed TB patients and one patient was a case of relapse of TB (age range: 24-68). Three of these cases had a coexisting rpoB_p.Ser450Leu mutation along with other DR mutations. None of the six PTB patients identified to have only the ahpC_p.Glu76Lys mutation, and no other DR mutation by WGS, showed phenotypic resistance. Two isolates identified to have only promoter ahpC_c.-52C>T mutations by WGS also did not have isoniazid resistance flagged in the BACTEC report. A WGS and phenotypic concordance of 90.9% was observed among isolates having only the katG_p.Ser315Thr mutation (n = 8) or the katG_p.Ser315Thr mutation coexisting with either fabG1_c.-8T>G (n = 1), ahpC_p.Glu76Lys (n = 1), or ahpC_c.-52C>T (n = 1). All three isolates that had the katG_p.Ser315Thr mutation along with other isoniazid mutations were phenotypically resistant. A WGS and phenotypic concordance of 86.6% was observed for all the katG mutations (n = 15), which included katG_p.Ser315Thr, katG_p.Ser315Asn, and katG_p.Asp419His. A phenotypic concordance of 85.7% was observed among isolates identified to have an inhA mutation (n = 7). A single isolate with a combination of ahpC_p.Glu76Lys and inhA_p.Ser94Ala showed phenotypic discordancy. Except for a sole isolate having only the inhA_p.Ile21Val mutation, all of the inhA-resistant isolates had either two inhA mutations, that is, inhA_c.-154G>A and inhA_p.Ile194Thr (n = 2), two inhA mutations (inhA_c.-154G>A, inhA_p.Ile21Thr) with ahpC_p.Glu76Lys (n = 1), a single inhA mutation (inhA_p.Ser94Ala) with ahpC_p.Glu76Lys (n = 1), or a combination of the inhA_p.Ile21Thr mutation along with the fabG1_c.-15C>T, fabG1_c.-16A>G, and ahpC_p.Glu76Lys mutations (n = 2). Except for the sole isolate having only the inhA_p.Ile21Val mutation, which was of mixed lineage 1 and 3, all the other six inhA-resistant isolates belonged to lineage 1 and sublineage 1.2.2.2. Four of these inhA-resistant isolates were from Udupi district, and one each from the Davanagere and Shimoga districts.
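The concordance figures reported above are simple agreement percentages between the WGS prediction and the phenotypic DST call. A minimal sketch of that calculation follows; the function name is hypothetical, and the toy inputs are back-calculated from the reported percentages and group sizes (for example, 83.3% of six isolates implies five concordant), not taken from per-isolate data.

```python
def concordance(paired_calls):
    """Percentage of isolates where WGS and phenotypic DST agree.

    paired_calls: iterable of (wgs_resistant, phenotype_resistant) booleans.
    """
    pairs = list(paired_calls)
    if not pairs:
        raise ValueError("no paired results supplied")
    agree = sum(1 for wgs, pheno in pairs if wgs == pheno)
    return 100.0 * agree / len(pairs)


# Back-calculated toy inputs (assumption, see the note above).
# rpoB_p.Ser450Leu: six isolates, five phenotypically resistant.
ser450leu = [(True, True)] * 5 + [(True, False)]
# katG_p.Ser315Thr alone or with one other variant: eleven isolates, ten concordant.
ser315thr = [(True, True)] * 10 + [(True, False)]
# inhA mutations: seven isolates, six concordant.
inha = [(True, True)] * 6 + [(True, False)]
```

Because every isolate in these groups is WGS-resistant, concordance here equals the proportion that is also phenotypically resistant; with small denominators, a single discordant isolate shifts the percentage by 9 to 17 points.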
There were eight and three isolates with mutations in the pncA and panD genes, respectively, conferring resistance to pyrazinamide. Among the 11 isolates identified to have pyrazinamide DR mutations, phenotypic resistance was observed among seven isolates. Discordancy was observed for both pncA (n = 3) and panD (n = 1) mutations. Nine isolates that were phenotypically sensitive to ethambutol were identified to have ethambutol resistance-conferring mutations by WGS. Lineage 1 constituted 81.8% and 71.4% of the DR mutations for pyrazinamide and ethambutol, respectively. Among isolates identified to have embB306 mutations, nine isolates had the embB_p.Met306Val mutation and one isolate had the embB_p.Met306Ile mutation. The WGS DR class for these isolates was MDR (n = 6), pre-XDR (n = 3), and isoniazid DR (n = 1).
rpsL_p.Lys43Arg (n = 5) was the most frequent mutation associated with streptomycin resistance and had 80% concordance with phenotypic results. High discordance was particularly seen for deletions in the gid gene. Among the total of eight isolates having only gid mutations, only one isolate showed phenotypic resistance. Among these eight isolates, six isolates had only gid gene deletions, and only one isolate (gid_c.115delC) showed phenotypic resistance. Five isolates with gyrA mutations and one isolate with a gyrB mutation, conferring resistance to fluoroquinolones, were also identified in our study. The phenotypic report available for two isolates showed resistance to ofloxacin. We report two patients with noncoding mutations at both the 1402 (C→A) and 1484 (G→T) positions of the rrs gene that confer aminoglycoside resistance. The phenotypic DST report for aminoglycosides was not available for these two isolates. A missense mutation in the folC gene, folC_p.Glu153Gly (458A→G), in two patients, as well as a promoter mutation in thyX (thyX_c.-16C>T) and a thyA deletion (thyA_c.-1099_*239del) in one patient, conferring resistance to para-aminosalicylic acid (PAS), were identified by WGS analysis.
Among 11 isolates identified to have MDR TB by WGS, nine isolates were lineage 1, with lineage 2 and 4 having one isolate each. Four of these isolates were from TB relapse cases. Four isolates were also identified to be from TB patients who had diabetes mellitus. Three cases identified to be pre-XDR TB by WGS had common DR mutations for rifampicin (rpoB_p.Ser450Leu), ethambutol (embB_p.Met306Val), and streptomycin (rpsL_p.Lys43Arg).

DISCUSSION
In the last decade, the number of research publications containing WGS data on M. tuberculosis has steadily increased. However, Asian and African countries are underrepresented in WGS research on M. tuberculosis when compared to their TB burden (30). Scientists from high TB burden countries should lead WGS research on M. tuberculosis to maximize the benefits derived from the TB genomics revolution (31). Routine WGS of M. tuberculosis isolates provided added value in expediting patient-centered management of DR tuberculosis in low TB burden settings with respect to commercial genotypic assays (32). WGS could predict M. tuberculosis susceptibility to first-line antitubercular drugs more reliably than phenotypic DST. It could also accurately detect mutations associated with low-level resistance, which are commonly missed in conventional DST (10).
M. tuberculosis strains belonging to separate lineages differ by 1200 SNPs on average (33). There are seven distinct phylogenetic lineages of M. tuberculosis, known as Indo-Oceanic lineage (lineage 1), East Asian lineage (lineage 2), East African-Indian lineage (lineage 3), Euro-American lineage (lineage 4), West African lineage I (lineage 5), West African lineage II (lineage 6), and the recently discovered lineage 7, which is predominantly restricted to Ethiopia (34-36). Lineage 1 and 3 have been reported to be predominant in the South India and North India TB populations, respectively (22,23,34). Reports from Advani et al. revealed a prevalence of 70% of lineage 3 and 14.28% of lineage 1 among the North Indian population (23). In contrast, reports by Manson et al. and Munir et al. among TB patients in Southern India revealed that lineage 1 constituted 70% and 67% of the isolates, respectively, while lineage 3 represented 16% and 9% of the M. tuberculosis isolates, respectively, characterized in those studies (22,25). Our observations agree with the previous findings from India that lineage 1 is more prevalent in South India (22,25,37). Mixed infections have also been reported in TB patients, where a TB patient could be infected with more than a single M. tuberculosis strain, including strains with differing resistance profiles (38-40). We also had a few mixed lineage isolates of lineage 1 and 3, lineage 1 and 4, and lineage 3 and 4, as shown in Table 1. Our findings revealed that sublineage 1.1.2 constituted most of the isolates among lineage 1. The sublineages 1.1.2 and 1.2.1 are the most common, accounting for about 1.1 million cases globally (41).
Mutations at codon 450 of the rpoB gene, particularly rpoB_p.Ser450Leu, have been associated with high levels of rifampicin resistance (42,43). Our findings show that all of the mutations in rifampicin-resistant strains were identified in the rpoB gene, with rpoB_p.Ser450Leu being the most frequently identified mutation. rpoB_p.Ser450Leu, rpoB_p.His445Asp, and rpoB_p.Ser450Phe could alter the structural interaction between rpoB and rifampicin and contribute to high-level rifampicin resistance. The interaction changes described are loss of a hydrogen bond and steric hindrance for the rpoB_p.Ser450Leu and rpoB_p.Ser450Phe mutations, and loss of a hydrogen bond for the rpoB_p.His445Arg and rpoB_p.His445Asp mutations (44). Higher phenotypic concordance was observed among isolates identified to have these mutations in our study. rpoB_p.His445Leu and rpoB_p.Leu430Pro, listed as borderline resistance rpoB mutations by the WHO, accounted for one-third of our isolates with rifampicin resistance mutations and had high phenotypic discordancy. These mutations, also referred to as low-level resistance mutations, have been reported to account for 12% (95% CI: 10-15%) of rifampicin resistance mutations based on WHO surveillance data from seven countries, but could also be occurring at higher frequencies (45). Rapid liquid culture systems such as the MGIT 960 system fail to detect strains with these borderline rifampicin resistance mutations (46,47). Discordancy between WGS and phenotypic DST by the MGIT 960 system (at the rifampicin critical concentration of 1.00 µg/mL) has been reported previously for the rpoB_p.His445Leu mutation and the rpoB_p.Leu430Pro mutation (10,48-50).
Several rifampicin resistance mutations cause diagnostic and treatment challenges when genotypic and phenotypic results are discordant, as in the cases of low-level rifampicin resistance mutations and rare or novel rpoB mutations. In such cases, MIC testing may be a potential supplemental tool for determining their clinical significance. In addition, low-level resistance mutations in rpoB may underscore the value of genotypic methods for the diagnosis of rifampicin resistance (49). The majority of the rifampicin resistance mutations were identified among lineage 1 isolates. Higher detection of isoniazid resistance among lineage 1 isolates with rifampicin resistance has been reported (56). We identified several isolates with rifampicin resistance mutations with coexisting DR mutations to one or more first-line antitubercular drugs, in particular isoniazid, implying the need for phenotypic and genotypic tests for all first-line antitubercular drugs once rifampicin resistance is suspected.
The mechanism of M. tuberculosis-mediated isoniazid resistance is highly complex due to the involvement of mutations in several genes, such as the katG and inhA open reading frames (ORFs), the fabG and inhA regulatory region, the oxyR′-ahpC intergenic region, the ahpC ORF, and various other genes (57). Of these, mutations in the katG gene and the inhA regulatory region represent the most common mutations found in isoniazid-resistant isolates (58). The mutation at codon 315 of the katG gene is the most frequent, where each of the nucleotides of that codon (AGC) can be mutated to encode a threonine, asparagine, arginine, isoleucine, glycine, or leucine residue (57). Mutations in katG are associated with a wide range of moderate- to high-level isoniazid resistance. katG_p.Ser315Thr was the most frequently identified mutation among the isoniazid-resistant isolates in our study, and high phenotypic concordance was observed in isolates having the katG_p.Ser315Thr mutation alone or coexisting with other isoniazid DR variants. The majority of the isolates with only the katG_p.Ser315Thr mutation showed MICs ≥ 0.5 µg/mL, and a small proportion of isolates exhibited MICs < 0.1 µg/mL (54). Other previously reported low-frequency missense mutations, such as katG_p.Ser315Asn and katG_p.Asp419His, were also observed at a low frequency in our study, with a phenotypic concordance of 66.7% and 100%, respectively (59). Except for a sole isolate, all isolates identified to have an inhA mutation had multiple inhA DR mutations or inhA mutation(s) coexisting with ahpC and/or fabG1 mutations (n = 6). All six of these isolates belonged to the same lineage 1, and five of them showed phenotypic resistance. Mutations within the promoter region of the fabG1-inhA operon lead to overexpression of inhA, resulting in low-level isoniazid resistance. Higher doses of isoniazid are required for isolates with this mutation to achieve complete inhA inhibition (57). The mutation at −15 (C→T) in the fabG1-inhA promoter region has been reported to be the most prevalent mutation in the inhA gene and is present in an average of 19% of the isoniazid-resistant clinical isolates worldwide (58). This mutation was found at low frequency (n = 5) in the current study, with three isolates having only the fabG1_c.-15C>T mutation (two isolates showed phenotypic resistance) and two isolates having fabG1_c.-15C>T coexisting with other isoniazid DR variants (both isolates were phenotypically resistant). Our study results show that a significant number of isoniazid-resistant isolates harbored mutations in the regulatory and ORF regions of the ahpC gene as well. A few isolates with the fabG1_c.-15C>T and ahpC_c.-52C>T mutations showed MICs ≤ 0.12 µg/mL (54). Isolates having only these noncoding mutations had lower phenotypic concordance in our study.
There were eleven samples in our study with the ahpC_p.Glu76Lys mutation, either alone or in combination with other DR variants (Table 2). All these samples were part of a well-supported cluster in the phylogenetic tree (Fig. 1), so they represent a singular lineage (lineage 1, sublineage 1.2.2.2) of DR TB that is present in South India. Interestingly, 10 out of the 11 isolates were from TB patients belonging to the same district. Six isolates that had only the ahpC_p.Glu76Lys mutation did not show phenotypic resistance, warranting the need for further research to assess the confidence/level of isoniazid resistance. While the variation present in the complete genomes of these TB samples is what defines this group as being derived from a common ancestor exclusive of the other samples, this specific DR variant (ahpC_p.Glu76Lys) is the only DR variant common to all of the samples in this cluster. The next most closely related samples in our study are a pair of highly drug-resistant samples that do not have this mutation. Isoniazid resistance is considered high when resistance is shown at critical concentrations of both 0.1 µg/mL and 0.4 µg/mL, and low when resistance is shown at a critical concentration of 0.1 µg/mL but the isolate tests susceptible at a concentration of 0.4 µg/mL. Our phenotypic isoniazid DST was done only at the critical concentration of 0.1 µg/mL, which detects both low- and high-level isoniazid resistance. Hence, additional testing of isolates at 0.4 µg/mL could help in identifying isolates with high-level resistance, which may have implications for optimal isoniazid dosing (13,60).
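The two-concentration rule for grading isoniazid resistance described above can be written as a small decision function. This is a sketch with a hypothetical function name; the "inconsistent" and "resistant, level undetermined" labels are additions for completeness and are not categories used in the text.

```python
def isoniazid_resistance_level(resistant_at_0_1, resistant_at_0_4=None):
    """Grade isoniazid resistance from MGIT results at 0.1 and 0.4 µg/mL.

    resistant_at_0_4 may be None when the higher concentration was not
    tested, as was the case for the DST described in this study.
    """
    if resistant_at_0_4 is None:
        # Only the 0.1 µg/mL result is available: resistance can be
        # detected but not graded.
        if resistant_at_0_1:
            return "resistant, level undetermined"
        return "susceptible"
    if resistant_at_0_1 and resistant_at_0_4:
        return "high"
    if resistant_at_0_1:
        return "low"
    if resistant_at_0_4:
        # Growth at the higher but not the lower concentration is not
        # expected; flagged here for retesting (assumption).
        return "inconsistent"
    return "susceptible"
```

For example, `isoniazid_resistance_level(True, False)` returns `"low"`, while `isoniazid_resistance_level(True)` returns the undetermined label that applies to a single-concentration test.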
Pyrazinamide resistance is primarily due to pncA gene mutations that could abort or reduce PZase activity (61-63). Ethambutol resistance is associated with genetic variations in embB, particularly in embB codon 306 (64,65). We observed a low concordance between WGS and phenotypic tests for pyrazinamide and ethambutol, which has been previously reported by several studies (66-68). A high prevalence of false resistance and false susceptibility has been noted for MGIT-based pyrazinamide and ethambutol DST (69,70). Phenotypic DST for pyrazinamide can give false-susceptible results when mutations have MICs close to the critical concentration, leading to underestimates of the specificity of genotypic DST (71). One of the two isolates with a pncA_p.Thr47Ala mutation showed phenotypic susceptibility in our study. The MIC for this mutation has been reported to be 50 µg/mL in MGIT 960 (72). Five out of eight isolates with the embB_p.Met306Val mutation were previously reported to be phenotypically susceptible by MGIT 960-based DST at the ethambutol critical concentration of 5.00 µg/mL. Several other low-frequency mutations associated with ethambutol resistance, such as embB_p.Met306Ile, were found to be susceptible by MGIT 960-based DST at the ethambutol critical concentration of 5.00 µg/mL (10). Isolates with embB_p.Met306Val and embB_p.Met306Ile mutations, which were susceptible at the ethambutol critical concentration of 5.00 µg/mL, showed resistance at the lower concentration of 2.5 µg/mL, implying the need for using ethambutol concentrations lower than the critical concentration of 5.00 µg/mL for identifying mutations with a low level of resistance (10). Isolates with embB_p.Met306Val and embB_p.Met306Ile mutations were reported to have an MIC of <5 µg/mL. Two isolates with the noncoding mutation embA_c.-12C>T (phenotypically susceptible in our study) were also reported to have an MIC of <5 µg/mL (54,55). Several studies have demonstrated a strong association between embB306 mutations and resistance to isoniazid/rifampicin, or the MDR phenotype (73,74). We also observed a high frequency of MDR TB and pre-XDR TB cases as identified by WGS among isolates having embB306 mutations, highlighting the need to investigate phenotypic susceptibility/DR mutations of other antitubercular drugs once ethambutol resistance is suspected.
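The discordance discussed above arises when a mutation raises the MIC without lifting it above the critical concentration used in the test. The sketch below makes that reasoning explicit under a simplifying assumption: an isolate is called resistant only if its MIC exceeds the critical concentration. The function name is hypothetical, the rule ignores the growth-unit thresholds that the MGIT instrument actually applies, and the 4.0 µg/mL ethambutol MIC is an illustrative value chosen to be consistent with the reported "<5 µg/mL" and resistance at 2.5 µg/mL, not a measured one.

```python
def expected_call(mic_ug_per_ml, critical_concentration_ug_per_ml):
    """Simplified expected DST call: growth at the critical concentration
    (MIC above it) is read as resistant, otherwise as susceptible."""
    if mic_ug_per_ml > critical_concentration_ug_per_ml:
        return "resistant"
    return "susceptible"


# pncA_p.Thr47Ala: reported MIC of 50 µg/mL against the 100 µg/mL
# pyrazinamide critical concentration used in this study.
pza_call = expected_call(50.0, 100.0)

# embB_p.Met306Val/Ile: hypothetical MIC of 4.0 µg/mL (see note above),
# tested at the standard 5.00 µg/mL and at a lower 2.5 µg/mL.
emb_call_standard = expected_call(4.0, 5.0)
emb_call_lower = expected_call(4.0, 2.5)
```

Under this simplification, the same isolate is read as susceptible at 5.00 µg/mL and resistant at 2.5 µg/mL, which is the pattern the cited studies describe for embB306 mutations.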
Mutations in the rpsL gene are associated with high streptomycin resistance, whereas rrs and gid gene mutations confer intermediate to high and low streptomycin resistance, respectively (75)(76)(77). A high proportion of phenotypic susceptibility among isolates with gidB mutations has been reported (78,79). We observed higher phenotypic concordance for the high-frequency rpsL_p.Lys43Arg mutation but lower phenotypic concordance for gid mutations. An isolate with the rpsL_p.Lys88Met mutation, which was phenotypically susceptible in our study, was reported to exhibit an MIC of 0.5 µg/mL (54). Several frameshift mutations, such as gid_c.102delG, gid_c.292delA, gid_c.351delG, and gid_c.87delC, and stop-gain mutations, such as gid_p.Arg206* and gid_p.Glu92*, were found to be phenotypically susceptible at the streptomycin critical concentration of 1.00 µg/mL. We did not identify any DR mutations by WGS for second-line antitubercular drugs such as cycloserine, linezolid, bedaquiline, clofazimine, and delamanid. Phenotypic tests were not done for these drugs.
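Phenotypic concordance for a mutation, taken here as the proportion of isolates carrying the mutation that test resistant by phenotypic DST, can be computed as in the following minimal sketch. The function name, the placeholder labels `mutation_A` and `mutation_B`, and the DST results are hypothetical and serve only to show the arithmetic; they are not data from this study.

```python
from typing import Dict, List


def phenotypic_concordance(dst_results: List[str]) -> float:
    """Fraction of mutation-carrying isolates that are phenotypically resistant.

    dst_results holds one phenotypic DST call per isolate: 'R' or 'S'.
    """
    if not dst_results:
        raise ValueError("at least one DST result is required")
    resistant = sum(1 for call in dst_results if call == "R")
    return resistant / len(dst_results)


# Hypothetical DST calls per mutation (placeholder labels, illustration only).
hypothetical_results: Dict[str, List[str]] = {
    "mutation_A": ["R", "R", "R", "S"],
    "mutation_B": ["S", "S", "R", "S"],
}

for mutation, calls in hypothetical_results.items():
    share = phenotypic_concordance(calls)
    print(f"{mutation}: {share:.0%} phenotypically resistant")
```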
The majority of the MDR TB isolates were from lineage 1. Four of the eleven isolates were from TB patients who had diabetes as a comorbidity. Diabetes has been reported to significantly increase the odds of developing MDR TB, warranting early diagnosis of these patients in addition to the development of robust treatment and follow-up strategies (80,81). The majority of the DR mutations were identified in relatively high proportions in lineage 1 in our study. A previous report from India showed that DR was more common among isolates in lineages 3 and 2 (82). Most of the other samples with discordant phenotypic and WGS-based drug sensitivity statuses also had at least one DR SNP present in a minority of the genomic sequence reads. For cases where TB Profiler did not report a SNP as being present, examination of the full VCF files for these samples found very low levels of genomic sequencing reads containing the DR SNP. If these samples contained a very low-frequency DR strain, then the process of culturing the bacteria for DNA extraction could miss these genomes, while the clinical DST protocol (growth on solid or in liquid antibiotic-containing media) could register a positive DR result.
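The proportion of reads supporting a DR SNP can be derived from per-allele read depths, as in the following minimal sketch. It assumes that the VCF sample column carries an `AD` entry (read depths per allele, reference allele first); the function name and the example record are hypothetical and do not reproduce the pipeline used in this study.

```python
def alt_read_fraction(format_field: str, sample_field: str) -> float:
    """Fraction of reads supporting non-reference alleles at one VCF site.

    Assumes the FORMAT column contains an 'AD' key whose value lists read
    depths per allele, with the reference allele first.
    """
    keys = format_field.split(":")
    values = sample_field.split(":")
    allele_depths = dict(zip(keys, values))["AD"]
    depths = [int(depth) for depth in allele_depths.split(",")]
    total = sum(depths)
    if total == 0:
        return 0.0  # no coverage at this site
    # For a site with a single alternate allele this is the SNP read fraction.
    return sum(depths[1:]) / total


# Hypothetical record: 196 reference reads and 4 reads carrying the SNP.
fraction = alt_read_fraction("GT:AD:DP", "0/0:196,4:200")
print(f"{fraction:.1%} of reads carry the alternate allele")
```

A variant supported by so few reads would typically fall below a caller's reporting threshold, which is consistent with a DR SNP being visible in the full VCF file but absent from the summary report.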
A cost comparison of different molecular diagnostic technologies for DR analysis reported higher costs and a longer time to result for WGS in South Korea (83). A similar cost analysis across different settings is required for policymaking for the detection of M. tuberculosis DR mutations in clinical, programmatic, and research settings.
Our study has a few limitations. Our overall sample size of 242 cases provided only limited information about DR mutations prevalent among the M. tuberculosis isolates from TB patients because the majority of samples were from drug-sensitive cases. The lack of comparison of DR mutations with other genotypic tests, such as first-line and second-line line probe assays (FL-LPA and SL-LPA), and with phenotypic tests for second-line antitubercular drugs is another limitation of our study. In general, the identification of potential DR using WGS data is limited to DR variants that have been identified in the literature. Also, standard WGS analysis protocols for DR do not account for epistatic interactions among genes (84), which could contribute to genotype/phenotype mismatches. Further, we tested only one critical concentration for each antitubercular drug in phenotypic DST. Borderline or low-level resistance could be missed by phenotypic DST, as low-level resistant isolates may not grow at the critical concentration and therefore test as susceptible. The critical concentrations tested are slightly higher than the MIC for some of these isolates (10). Future studies correlating mutations that may cause antitubercular DR by WGS with phenotypic DST may employ MIC testing at different concentrations to identify and distinguish low-level and high-level resistance mutations, with implications for optimal dosing. Nevertheless, the substantial number of genetic variations found among the limited number of isolates sequenced in our work indicates that wide-scale efforts are required to fully characterize M. tuberculosis genetic diversity at a population level. Large-scale genomic data mining will be crucial for global genomic data sets, where genomic diversity information on M. tuberculosis from high-burden TB settings is relatively underrepresented. Assessment of the effectiveness of existing antitubercular medications, as well as novel therapeutic interventions, against these DR mutations will be aided by linking clinical information with pathogen genomic data, as in the TB Portals database. Such linked data could resolve inconclusive genotype-phenotype correlations and, through approaches such as machine learning, aid in developing a list of highly predictive mutations for guideline preparation and consequent clinical decision-making.

Conclusion
WGS could be used as a complementary tool to the existing phenotypic and genotypic methods, providing fast, comprehensive information on mutations that may cause antitubercular DR and aiding in the early characterization of lineages. Wide-scale efforts to assess the level of resistance of mutations identified by WGS, by correlation with MIC testing, are warranted to understand the phenotypic impact of the mutations on different strains and to enable the use of WGS as a potential point-of-care diagnostic and antitubercular drug/dose optimization tool. Data mining and research on such large clinical- and genomic-linked datasets could support antitubercular DR surveillance, identification of mixed infections, data sharing across borders, and implementation of precision therapy for TB.

FIG 1 Maximum likelihood phylogeny based on WGS data for 242 TB samples plus the H37Rv reference. Major TB lineages are indicated by branch labels. The H37Rv reference is indicated by a dot on the branch tip. Next to the tree are grids indicating drug-sensitivity test (DST) class, district of origin, and drug resistance (DR) single-nucleotide polymorphisms (SNPs). DST is indicated by color according to the legend. Black boxes indicate the district of sample origin. DR SNPs are indicated by a grayscale ranging from black (100% of WGS reads had the DR SNP) to white (0% of WGS reads had the DR SNP).

TABLE 2 (Continued)
Summary of mutations identified by WGS for drug resistance (DR), and its proportion among different lineages a
Column headings: Sl. no.; Mutation and combination of mutations that may confer resistance to each antitubercular drug; Total N

FIG 3 Frequency of the mutations/combination of mutations identified by WGS that may confer resistance to specific first-line antitubercular drugs, along with their phenotypic concordance (resistance by phenotypic drug sensitivity test). S = Ser; L = Leu; H = His; G = Gly; E = Glu; M = Met; A = Ala; R = Arg; D = Asp; P = Pro; F = Phe; T = Thr; K = Lys; N = Asn; I = Ile; V = Val; Q = Gln; PZA = pyrazinamide.

TABLE 1
Overall distribution of M. tuberculosis clinical isolates based on lineage subtype (n = 242)
a Indo-Oceanic lineage. b East Asian lineage. c East African-Indian lineage. d Euro-American lineage. e Mixed lineage.

TABLE 2
Summary of mutations identified by WGS for drug resistance (DR), and its proportion among different lineages a