Bedaquiline and clofazimine resistance in Mycobacterium tuberculosis: an in-vitro and in-silico data analysis

Summary Background Bedaquiline is a core drug for the treatment of multidrug-resistant tuberculosis; however, the understanding of resistance mechanisms is poor, which is hampering rapid molecular diagnostics. Some bedaquiline-resistant mutants are also cross-resistant to clofazimine. To decipher bedaquiline and clofazimine resistance determinants, we combined experimental evolution, protein modelling, genome sequencing, and phenotypic data. Methods For this in-vitro and in-silico data analysis, we used a novel in-vitro evolutionary model using subinhibitory drug concentrations to select bedaquiline-resistant and clofazimine-resistant mutants. We determined bedaquiline and clofazimine minimum inhibitory concentrations and did Illumina and PacBio sequencing to characterise selected mutants and establish a mutation catalogue. This catalogue also includes phenotypic and genotypic data of a global collection of more than 14 000 clinical Mycobacterium tuberculosis complex isolates, and publicly available data. We investigated variants implicated in bedaquiline resistance by protein modelling and dynamic simulations. Findings We discerned 265 genomic variants implicated in bedaquiline resistance, with 250 (94%) variants affecting the transcriptional repressor (Rv0678) of the MmpS5–MmpL5 efflux system. We identified 40 new variants in vitro, and a new bedaquiline resistance mechanism caused by a large-scale genomic rearrangement. Additionally, we identified in vitro 15 (7%) of 208 mutations found in clinical bedaquiline-resistant isolates. From our in-vitro work, we detected 14 (16%) of 88 mutations so far identified as being associated with clofazimine resistance and also seen in clinically resistant strains, and catalogued 35 new mutations. Structural modelling of Rv0678 showed four major mechanisms of bedaquiline resistance: impaired DNA binding, reduction in protein stability, disruption of protein dimerisation, and alteration in affinity for its fatty acid ligand. Interpretation Our findings advance the understanding of drug resistance mechanisms in M tuberculosis complex strains. We have established an extended mutation catalogue, comprising variants implicated in resistance and susceptibility to bedaquiline and clofazimine. Our data emphasise that genotypic testing can delineate clinical isolates with borderline phenotypes, which is essential for the design of effective treatments. Funding Leibniz ScienceCampus Evolutionary Medicine of the Lung, Deutsche Forschungsgemeinschaft, Research Training Group 2501 TransEvo, Rhodes Trust, Stanford University Medical Scientist Training Program, National Institute for Health and Care Research Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Bill & Melinda Gates Foundation, Wellcome Trust, and Marie Skłodowska-Curie Actions.


In vitro evolution experiments 27
In vitro experiments were carried out with H37Rv strain (ATCC 27294). Bedaquiline was purchased from 28 Janssen-Cilag GmbH and clofazimine was ordered from Sigma (C8895-1G) both were reconstituted from 29 powder in DMSO and stored at - 20°C. 30 Evolutionary experiments were conducted with the Mycobacterium tuberculosis complex (Mtbc) lab strain 31 H37Rv. The bacteria were cultured from frozen stocks and (pre)cultured at 37° in 7H9 medium, 32 supplemented with 0•2% glycerol, 10% oleic albumin dextrose catalase (OADC), and 0.05% Tween 80 33 (termed culture medium). At exponential growth phase (between 0•3 and 0•6 optical density (OD) of 34 600nm), bacteria were transferred to new culture medium with a final OD of 0.05 in 50 mL, additionally 35 supplemented with either bedaquiline (BDQ) or clofazimine (CFZ). Four concentrations below the 36 minimum inhibitory concentration (MIC) were included for each drug at 1:2, 1:4, 1:8, and 1:16 below the 37 strain MIC (and an antibiotic-free control). The MIC of the susceptible WT ancestor was determined to be 38 0•12 mg/L BDQ and 0•5 mg/L CFZ in this culturing system. 39 These experiments were carried out over 20 days with five culture passages total. Culture passages were 40 conducted every four days with a final culture OD of 0•05 transferred. After passages 1, 3, and 5; bacteria 41 were plated on selective agar plates of 7H11, supplemented with 0•5% glycerol, 10% OADC and either 42 0•12 (1:2 MIC) or 0•25 (MIC) mg/L of BDQ, or 0•25 (1:2 MIC) mg/L CFZ. After 21-28 days of growth on 43 selective agar plates, single colonies were transferred to culture medium. Single colonies were selected 44 from multiple plates of variable sizes and morphologies then grown for 2-4 weeks in antibiotic free 7H9 45 broth. Two 1 mL aliquots were frozen at -80 for storage, and an additional bacterial sample was taken for 46 whole genome sequencing (WGS). All remaining colonies (collected by experimental condition) were 47 scaped from agar plates using sterile loops, and collected for WGS, i.e. total population sequencing. This 48 population sequencing aimed to remove the bias of single colony selection and give an overview of all 49 variants in the total mutant populations throughout resistance-associated genes. 50

Whole genome sequencing 51
Isolates from in vitro evolutionary experiments underwent DNA isolation by CTAB method. 1 Paired-end 52 DNA library preparations and sequencing was performed with Illumina technology (Nextera-XT and 53 NextSeq500) according to the manufacturer's instructions and with a minimum average genome coverage 54 of 50x. Fastq files (raw sequencing data) were mapped to the M. tuberculosis H37Rv reference genome 55 (GenBank ID: NC_000962.3) using the MTBseq pipeline. 2 56 Briefly, for the analysis of single mutants we considered single nucleotide polymorphisms (SNPs) which 57 were covered by a minimum of four reads in both forward and reverse orientation, four reads calling the 58 allele with at least a phred score of 20, and an allele frequency of 75%. For the detection of insertions and 59 deletion (InDels) we allowed a frequency over 50%. 60 Deep sequencing was conducted for population diversity analysis with an average genome wide coverage 61 of 200 -420; SNPs and InDels were called with at least one forward and one reverse read and a phred 62 score >20 for at least 2 reads. All sequences were submitted to the European Nucleotide Read archive, 63 under project accession number ERX5619327. 64 Three isolates underwent sequencing using the PacBio Sequel II System (Pacific Biosciences). Libraries 65 were prepared with the SMRTbell® Express Template Prep Kit 2•0 according to the manufacturer's 66 instructions. Barcoded overhang adapters for multiplexing were ordered at IDT (Integrated DNA 67 Technologies). During demultiplexing barcodes were filtered for a minimum quality of 50t, which yielded 68 long read sequencing data of an average of 4•4 GB and mean subread length of 9 KB. Long read sequences 69 were de novo assembled using PacBio SMRT® Link software version 9 and the "Microbial Assembly" 70 workflow with a set genome size of 4•5 GB with default parameters. 71

Screening of Rv0678 mutations in clinical samples via the CRyPTIC strain collection 72
Patient derived Mtbc isolates were collected by CRyPTIC partners throughout 27 countries and analyzed 73 in 14 different laboratories. Strains were selected from the CRyPTIC database for this study if they had 74 matched genotype and phenotype data, a high quality BDQ phenotype, and a mutation in BDQ resistance 75 associated gene (either atpE, Rv1979c, pepQ, mmpL/S5, or Rv0678). This led to the curation of 179 strains 76 in total. 77 The full analysis pipeline for CRyPTIC was performed using the Clockwork v0.8.3 pipeline, 3,4 but we outline 78 the key steps here. Sequencing reads were deposited at the European Bioinformatics Institute and run 79 through a bespoke bioinformatics pipeline (publicly available here: https://github.com/iqbal-lab-80 org/clockwork). In short, reads were filtered against human and other microbial species reads before 81 being mapped to the H37Rv reference genome. Two parallel variant callers (SamTools and Cortex) were 82 used, 5,6 one of which makes high sensitivity SNP calls (SamTools) and one which makes high specificity 83 SNP and indel calls (Cortex). A graph-based adjudication tool (Minos, https://github.com/iqbal-lab-84 org/minos) was then used to combine these results and create a final set of variants for downstream 85 bioinformatics analysis. All strains were re-genotyped at positions that were variant in at least one 86 CRyPTIC sample, creating a final variant call file with a call for each variant position in the M. tuberculosis 87 H37Rv genome. 88

Phenotyping 89
In vitro single selected mutants were further analyzed for phenotypic drug susceptibility to both BDQ and 90 CFZ by broth microtiter dilution as a resazurin assay. All mutants were grown in antibiotic free culture 91 media to exponential growth phase (optical density of 0 CRyPTIC strains were phenotyped using either the UKMYC5 plate or the updated UKMYC6 plate. 7 Plates 107 were sealed, incubated at 37°C, and read at 14 days. In addition to manual plate readings, all plate images 108 underwent an automated reading using AMyGDA software. 8 Plates without essential agreement for a drug 109 MIC were marked as low quality and sent to a citizen science project (BashTheBug, http://bashthebug.net) 110 for additional verification. Plates with exact agreement between at least two phenotyping methods were 111 marked as high quality. Low (no method agrees) and medium (two methods with essential agreement) 112 quality phenotypes were excluded from this analysis. 113 All strains included in this study had at least phenotypic data for BDQ. Some strains do not include CFZ 114 data due to missing information, low quality analysis, or removal due to experimental error. 115

Phenotypic interpretation 116
Phenotypic interpretation of MIC was determined based on testing method as follows: 117 criteria included the key "TB", "Mycobacterium tuberculosis", "MTB", "bedaquiline", "clofazimine", 142 "treatment", "clinical report", "patient", "MDR-TB", "XDR-TB", "diarylquinoline", and "drug resistance". Statistical testing of structural features was performed in R, using the Fisher's exact and Wilcoxon rank-157 sum test on data listed in appendix 2 pp5. SNAP2 scores use machine learning to predict the effect of an 158 amino acid change has on the function of the protein, scores generated for Rv0678 included in appendix 159 2 pp6 (https://rostlab.org/services/snap2web/). Finally, significant changes in mCSM stability 160 measurements were categorically defined as an absolute change >=1 kcal/mol for the purposes of 161 statistical testing. 162

Molecular dynamics simulations 163
The mutated system for these simulations were chosen in order to evaluate the resistant and susceptible 164 phenotype effect on BDQ and CFZ with mutations in the same codon, in order to evaluate the residue 165 type mutation, i.e. affecting the hinge region (not protein folding of DNA binding). All the systems 166 simulated in the present work, including Rv0678-WT, Rv0678-A101E, Rv0678-L40V and Rv0678-L40F, 167 were prepared and simulated using BiKi Life Sciences Software Suite version 1•3•5 of BiKi Technologies 168 s.r.l. 16 Each simulated system consisted of Rv0678 homodimer unit X-ray structure (PDB ID 4NB5) and the 169 mutations were generated using UCSF Chimera software. 13