Application of next-generation sequencing to detect variants of drug-resistant Mycobacterium tuberculosis: genotype–phenotype correlation

Background Drug resistance in Mycobacterium tuberculosis (MTB) is a major health issue worldwide. Recently, next-generation sequencing (NGS) technology has begun to be used to detect resistance genes of MTB. We aimed to assess the clinical usefulness of Ion S5 NGS TB research panel for detecting MTB resistance in Korean tuberculosis patients. Methods Mycobacterium tuberculosis with various drug resistance profiles including susceptible strains (N = 36) were isolated from clinical specimens. Nucleic acids were extracted from inactivated culture medium and underwent amplicon-based NGS to detect resistance variants in eight genes (gyrA, rpoB, pncA, katG, eis, rpsL, embB, and inhA). Data from previous studies using the same panel were merged to yield pooled sensitivity and specificity values for detecting drug resistance compared to phenotype-based methods. Results The sequencing reactions were successful for all samples. A total of 24 variants were considered to be related to resistance, and 6 of them were novel. Agreement between the phenotypic and genotypic results was excellent for isoniazid, rifampicin, and ethambutol, and was poor for streptomycin, amikacin, and kanamycin. The negative predictive values were greater than 97% for all drug classes, while the positive predictive values varied (44% to 100%). There was a possibility that common mutations could not be detected owing to the low coverage. Conclusions We successfully applied NGS for genetic analysis of drug resistances in MTB, as well as for susceptible strains. We obtained lists of polymorphisms and possible polymorphisms, which could be used as a guide for future tests applying NGS in mycobacteriology laboratories. When analyzing the results of NGS, coverage analysis of each samples for each gene and benign polymorphisms not related to drug resistance should be considered.


Tuberculosis (Tb) is an infectious disease caused by
Mycobacterium tuberculosis (MTB), and is one of the most significant health issues problems worldwide. Approximately 10.4 million incident cases of Tb occurred in 2016; the majority of these were from the South-East Asian Region, followed by the African and Western Pacific Regions. Tb has caused more than 1.6 million deaths worldwide, and is the ninth leading cause of death [1]. Epidemiological control of Tb, especially drug-resistant strains, is one of the most challenging issues globally.
Various methods have been used to detect the susceptibility of MTB to various kinds of drugs, ranging from phenotyping assays to genotyping assays. For phenotyping of drug resistance, proportional methods and absolute concentration methods can be used. Phenotypic methods usually require several weeks to several months to perform, as they require culturing MTB, which is a slow-growing microbe. To overcome this issue, genotypic methods have been used to detect drug resistances in MTB. Line probe assays and the Xpert MTB/RIF assay (Cepheid, Sunnyvale, CA, USA) are representative genotyping methods for detecting drug resistance [2].
There have been several reports using next-generation sequencing (NGS) technology to reveal drug resistance profiles in MTB [3][4][5][6][7]. Previous studies usually used only MTB that was resistant to one or more drugs, without including susceptible strains. In clinical practice, however, it is highly unlikely to encounter such a situation. In other words, clinicians need to apply tests without prior information on the resistance status, and therefore need information on whether variants found in clinical specimens could be related or unrelated to resistance. In that sense, positive predictive value (PPV) and negative predictive value (NPV), as well as sensitivity and specificity, are crucial performance parameters for application of NGS in clinical practice.
This study attempted to overcome these problems. The aim of this study was (1) to apply genetic analysis using NGS for susceptible MTB strains as well as for drugresistant strains, and (2) to estimate the PPV and NPV of NGS for detection of drug resistance in clinical practice.

Clinical specimens
This study was approved by the Institutional Review Board of Dongtan Sacred Heart hospital (approval number: HDT NON2017-002). A total of 36 isolated tuberculosis strains were collected at Hallym University Medical Center in 2017. All strains were isolated from clinical specimens. Drug susceptibility patterns were identified in the Korean Institute of Tuberculosis using the absolute concentration method with Lowenstein-Jensen (LJ) medium. This method was performed using the M-kit (Multiplexing MTB drug susceptibility testing [DST] kit; Korean Institute of Tuberculosis, Osong, Korea), which can conduct DST on 16 drugs simultaneously [8,9]. The critical concentrations for each drug were as follows: isoniazid (INH) 0.2 μg/mL and 1.0 μg/mL, rifampin (RFP) 40 μg/mL, ethambutol (EMB) 2.0 μg/mL, streptomycin (SM) 10 μg/mL, amikacin (AMK) 30 μg/mL, kanamycin (KM) 30 μg/mL, ofloxacin (OFX) 4.0 μg/mL, moxifloxacin (MXF) 2.0 μg/mL, and levofloxacin (LEV) 2.0 μg/mL. The susceptibility for pyrazinamide (PZA) was determined using the pyrazinamidase activity test by Wayne's method. Strain stock adjusted to McFarland No. 1 was tenfold diluted with phosphate buffered saline. Each 25 μL was inoculated for test wells using repeating pipette. The inoculated media was incubated at 37 °C. During the 1st week, culture conditions were evaluated once daily to identify any contamination. The final interpretation was made in the 4th week to determine the culture status of the controls. Study strains were selected to represent various kinds of drug susceptibility patterns, from all-susceptible to extensively drug-resistant isolates.

Extraction of nucleic acids
Culture media containing the study strains were boiled at 100 °C for 30 min to inactivate the bacteria. After boiling, the culture media from the liquid culture bottle or colony scrape from the surface of Ogawa media were used for subsequent steps. Nucleic acids were extracted using QIAsymphony DSP Virus/Pathogen Mini kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions.

PCR and sequencing
Target genes included in the study were as follows: katG and inhA for INH resistance, rpoB for RFP resistance, embB for EMB resistance, pncA for PZA resistance, rpsL for SM resistance, eis for KM/AMK resistance, and gyrA for fluoroquinolone (FQ) resistance. All coding regions of the eight target genes responsible for drug resistance were amplified and sequenced by the Ion AmpliSeq TB Research Panel using an IonS5 XL system (Thermo Fisher Scientific, Waltham, MA, USA) according to the manufacturer's instructions for use [4,7]. Information about the primers used can be obtained on the webpage provided by Thermo Fisher Scientific (https ://ampli seq. com/) or is available from the authors on request.
In brief, the libraries were made using Ion AmpliSeq Library kits (Thermo Fisher Scientific) and labeled by Ion Xpress Barcode Adapter kits (Thermo Fisher Scientific). After purification, the library was amplified using emulsion PCR in an Ion Chef instrument (Thermo Fisher Scientific) and sequenced by an ion-semiconductor sequencer.

Bioinformatic analysis
Raw sequences from the instrument were aligned to the reference genome of MTB (Mycobacterium tuberculosis H37Rv, NC_000962.3) and variants were called using the Ion Report v.5.2 software (Thermo Fisher Scientific) and CLC Genomics Workbench 11 (Qiagen). Coverage was assessed using the coverage analysis plug-ins in the applications. Variants were selected for the final analysis if they were detected in both pipelines and suspected to cause non-synonymous changes (including nonsense, missense, and frame-shift mutations) in coding regions.

Estimated predictive values using pooled sensitivity and specificity
Data from two previous studies using the same panel that we used were merged with our data to calculate the pooled sensitivity and specificity [4,7]. PPV and NPV were calculated using the prevalence of drug-resistant MTB [10].

Statistical analysis
Cohen's kappa values were calculated to estimate the agreement between the phenotypic and genotypic results. Results with kappa values greater than 0.8 were considered to be almost perfect agreement. Variants were considered to be associated with a resistance phenotype if the P value was less than 0.05 in Fisher's exact test (resistance) or if the variants were found only in resistant strains but not in susceptible strains (possible resistance). All statistical analyses were performed using MedCalc 18.6 (MedCalc Software, Ostend, Belgium).

Phenotypic resistance patterns of specimens
Drug resistance patterns determined by absolute concentration methods are displayed in Table 1. Half (n = 18) of the samples showed susceptibility to all drugs tested, while others showed resistance to one or more of the drugs.

Sequencing results
The sequencing experiments were successfully performed for all samples included in the study. Coverage at 1×, 20×, 100×, and 500× were 99.57% to 100%, 96.92% to 100%, 85.71% to 100%, and 45.24% to 96.62%, respectively. Because all samples showed more than 96.92% for 20× coverage and 100% in 27 samples, our experiments could be considered to cover virtually all target regions. Meanwhile, some samples showed low coverage for specific regions in parts of target genes, which will be discussed later.

Variant interpretation and genotype-phenotype correlation
A total of 39 variants were found in the samples. The number of variants in each sample ranged from one to ten (Table 1). Table 2 shows the list of variants found in our study and their interpretations. In total, 24 variants were determined to be related to resistance to the corresponding drugs (3 for gyrA, 2 for rpsL, 5 for embB, 3 for inhA, 3 for katG, 4 for pncA, and 3 for rpoB), and the remaining 15 variants were thought to be benign polymorphisms. Among the variants associated with resistance phenotypes, six have not yet been reported (katG L378R and Y597D; pncA S18Ter and H82Pfs; embB I419V; and rpoB R552L).
The overall agreements between genotypic and phenotypic results are shown in Table 3. For INH, RFP, and EMB, the two tests showed almost perfect agreement (Cohen's kappa of 0.824 to 1.000), while for SM, the agreement was poor (Cohen's kappa of 0.491).

Estimated predictive values using pooled sensitivity and specificity
The calculated pooled sensitivity and specificity are displayed in Table 4. Generally, NPVs were remarkably high (greater than 97%), while the PPVs were variable, ranging from 44% to 100%. The PPV for AMK/KM could not be estimated because the pooled sensitivity was 0%.

Discussion
One of the biggest issues with drug susceptibility tests for MTB is their turnaround time. Traditional methods such as absolute concentration or proportional methods usually take up to several weeks. This issue is inherent in phenotypic methods, and arises from the low growth rate of MTB. Genotypic methods have overcome this issue, with turnaround times of several days [2].
There have been many reports of using NGS for MTB drug susceptibility tests, and some of them have revealed the utility of semiconductor-based NGS [4,5,7]. Two of these reports used the same panel as in our study. However, they applied this technology to MTB isolates resistant to one or more drugs, which is unlikely to be encountered in routine clinical practice. For clinical application of NGS to drug susceptibility testing, the test should provide results in advance of phenotypic testing, without any information regarding the susceptibility. Therefore, we need to have knowledge of results from susceptible strains as well as from resistant strains. In this study, we used 18 susceptible strains to reveal the benign polymorphisms. In this study, Cohen's kappa values were calculated to assess the agreement between phenotypic and genotypic test results. Resistance patterns for INH, RFP, and EMG showed almost perfect agreement between the two methods, although the agreements were poor for other drugs. In comparison to the results from previous studies, genotypic tests showed relatively poor performance for SM (0.491 versus 0.769, 0.746) [4,7]. However, the 95% confidence interval was wide due to the small number of specimens, and we therefore cannot conclude that our results were statistically different from those of other studies.
Fisher's exact test was used for determine the clinical significance of each variant. Among 39 variants found in this study, six (gyrA c.C269T, rpsL c.A128G, embB c.A916G and c.G918A, katG c.C944G and rpoB c.C1349T) showed P values less than 0.05 and were designated as 'resistance' . 'Possible resistance' variants were  Most instances of RFP resistance come from mutations in the "rifampicin resistance-determining region" of rpoB, spanning codons 507-533 [11,12]. In contrast, the most frequent variant found in rpoB in our study was S450L (10 strains), and no variants were found in codons 507-533. This discrepancy was probably due to the relatively low coverage at these regions (< 100×) in many samples. Two main causative genes for INH resistance are katG and inhA. The most prevalent variant in this study was katG R463L, which was determined to be a benign polymorphism. Among variants associated with the resistance phenotype, the katG S315T mutation was most frequently found, which is consistent with previous reports [11,13]. However, the most prevalent mutation in the promoter region of inhA c.− 15C>T could not found because the Ion AmpliSeq panel did not cover the area [4,7,11]. This missing coverage and other genes responsible for INH resistance, such as ahpC, might be the reason that no mutations were found in three strains with resistance phenotypes.
Codon 306 in embB is the most important area for EMB resistance [14,15]. This study showed similar results: six out of seven resistant strains showed variants in codon 306. Interestingly, one strain with the M306V variant was determined to be susceptible to EMB in phenotypic assays. This discrepancy might be due to a failure in the phenotypic assay or the presence of other mechanisms interacting with the embB mutations. We could not further investigate this issue, which should be the subject of subsequent studies.
We have found two novel loss-of-function mutations in pncA that are responsible for PZA resistance. There have already been many frame-shift mutations reported in the pncA gene [16]. Pyrazinamidase, the product of the pncA gene, is required to convert PZA to its active form, pyrazinoic acid [17]. Therefore, it is reasonable to consider that frame-shift mutations or nonsense mutations in the pncA gene cause drug resistance. No causative mutations were detected in four out of eight resistant strains. This result is comparable to those of previous studies, suggesting the presence of mutations in promoter and/or other regulatory genes [17,18].
Only four mutations could be found in rpsL, and none in eis. For rpsL, the most common mutation is K43R [11], which was also found in this study. Meanwhile, the agreements between phenotypic and genotypic results were regrettably poor for these drugs, which researchers and/ or doctors should be alert to when using this panel. There   are many other genes responsible for resistance to these drugs, such as rrs, which might explain the poor performances [19]. FQ resistance arises from mutations in gyrA and/or gyrB, with codons 90 and 94 in gyrA most frequently involved [20,21]. Our data showed similar results, and all variants other than those in codons 90 and 94 were determined to be benign polymorphisms. Specifically, three variants, E21Q, S95T, and G668D, were found in almost every strain, suggesting the existence of common polymorphisms in MTB in Korea.
We estimated pooled sensitivity and specificity using combined data from this study and previous studies [4,7]. The NPVs ranged from 97.8 to 99.8%, while the PPVs were variable, from 43.8 to 100%. These data suggested that NGS can be used as a tool to rule out drug resistance before phenotypic results are available.

Conclusions
In conclusion, we successfully applied NGS to the genetic analysis of drug resistance in MTB, as well as in susceptible strains. The Ion AmpliSeq Tb Panel showed satisfactory specificity and NPV and can therefore be used for early exclusion of resistance in MTB. The panel need to be enhanced to increase the sensitivity and PPV. A useful result of this study was a list of variants with their clinical significance: six variants with confirmed resistance, 18 with possible resistance, and 15 that were susceptible. This list could be used as a guide in future applications of NGS in mycobacteriology laboratories. When analyzing NGS results, the coverage analysis of each sample for each gene, as well as benign polymorphisms not related to drug resistance, should be considered.