Structure based identification of novel inhibitors against ATP synthase of Mycobacterium tuberculosis: A combined in silico and in vitro study

a South AfricanNational Bioinformatics Institute (SANBI), SAMedical Research Council Bioinformatics Unit, University of theWestern Cape, Private Bag X17, Bellville, 7535 Cape Town, South Africa b DST-NRF Centre of Excellence for Biomedical Tuberculosis Research, SAMRC Centre for TB Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa


Introduction
Effective treatment and control of Mycobacterium tuberculosis, the causative agent of tuberculosis (TB) has remained a global challenge especially in terms of drug resistance [1,2]. However, the incidence rates of multidrug-resistant TB (MDR-TB) have sharply increased, placing huge demands on TB healthcare programs [3]. Treatment options for TB have remained stagnant over the past 50 years, which has contributed to the burden of drug resistance. In order to provide an efficient therapy for MDR TB, the FDA recently approved the use of bedaquiline as a new anti-TB drug [4,5]. Bedaquiline (TMC207) is a diarylquinoline derivative and was designed to inhibit ATP synthase. ATP synthase is essential for the optimal growth of M. tuberculosis as it is a key enzyme involved in energy metabolism and is thus regarded as a promising target for anti-TB drugs [6,7]. It catalyzes the production of ATP by using the energy amassed in the form of difference in the transmembrane electrochemical potential of a coupling ion [8]. The bacterial ATP synthase comprises a membrane-bound F o segment which is composed of a 1 b 2 c 10-15 subunits as well as a hydrophilic F 1 component containing α 3 β 3 γδε subunits ( Fig. 1) [8]. The passage of proton ions across F o activates the conformational changes leading to the rotation of oligomeric subunit c, coupled with the rotation of ε as well as γ subunits within the α 3 β 3 hexamer of F 1 forces the synthesis of ATP [8].
In pathogenic bacteria, the proteins involved in energy metabolic pathways are mostly unexplored as drug targets [9][10][11]. The ATP in bacteria is produced by oxidative phosphorylation using the respiratory chain as well as by substrate-level phosphorylation of fermentable carbon sources [9,10]. The ATP synthase in these pathogenic bacteria may incorporate special functionalities which facilitate their survival in energy deficit conditions [8]. Despite the down-regulation of ATP synthase in the dormant condition, this group of enzyme can still be used as a drug target in M. tuberculosis [8,12]. In the previous study, we identified nine putative gene targets in M. tuberculosis by using genome and metabolic pathway mapping [13]. Therefore, we selected Rv1311 (epsilon (ε) subunit) (Fig. 1) which is involved in the regulation of the ATP synthase by affecting the efficiency of component coupling, along with the induction of the catalytic pathway as well as involved in the selective inhibition of ATP hydrolysis [14]. Bedaquiline is primarily designed to target the c subunit of the ATP synthase [15], but can also perform by inhibiting the activity of ε subunit [16].
The current study aimed at the identification of suitable ATP synthase ε chain inhibitors through structural analyses and pharmacophore modeling because of the bedaquiline resistance mechanisms in M. tuberculosis [17]. In the primary steps, the common feature pharmacophore models were generated for the Rv1311-ATP complex and 35 potential inhibitory molecules are sharing the respective features were identified in the ZINC database. The 35 classified drug molecules were subjected to virtual screening and the drug molecules with the highest predicted binding affinities (ZINC65375075 and ZINC11592624) were selected for further studies. Each resultant docked system was subjected to 100 ns molecular dynamics (MD) simulations in explicit water conditions, in order to validate the generated interaction patterns and for the retrieval of the lowest energy conformations. Furthermore, the respective drugs were purchased and tested using an in vitro growth inhibition assay. The generated outcomes provide an insight into the structural inhibition of Rv1311 and can facilitate the process of drug design and discovery.

Structural modeling and quality assessment
No experimentally resolved three dimensional (3-D) structures are available for Rv1311 in Protein Data Bank (PDB). Therefore, its sequence (UniProt ID -P9WPV1) was modeled using "PRIME" homology modeling module of Schrödinger [18,19]. Due to high amino acid conservation of P9WPV1 predicted using ConSurf server [20], it was selected as a model sequence for the current study. The generated 3-D structures of Rv1311 were further subjected to optimization tools for energy minimization, side chain as well as loop refinement. The stereochemical qualities of the predicted structures were assessed using the Ramachandran plots as well as on the basis of generated DOPE scores. The structures of designed ligands were constructed by means of drawing utilities present in the MAESTRO (Schrödinger Release 2018-1: Maestro, Schrödinger, LLC, New York, NY, 2018) and the geometries of the resultant ligand structures were optimized by using the Density Functional Theory (DFT) method present in JAGUAR [21]. The generated structures were subjected to further interaction studies.

Pharmacophore modeling
The potential scaffolds for Rv1311 were discovered using the pharmacophore space modeling techniques available in Discovery Studio (DS) 2016. The "Receptor -Ligand Pharmacophore Generation" module in DS was used for the generation of selective pharmacophore model from Rv1311-ATP complex. The models are generated from the interaction features present in Rv1311-ATP. The main objective of building the respective pharmacophore models is to identify the potential inhibitors that can bind strongly to the Rv1311. The best pharmacophore hypothesis was used to screen the ZINC database (version 15), which contain information about 120 million commercially available "druglike" compounds [22]. The search was set to 600 for "Best N Hits" in the section "Limit Hits" in order to obtain the best chemical entities. The generated list of molecules was further filtered using a number of criterions to achieve the final potential inhibitory molecules. The validation of the generated pharmacophore model was performed using "Decoy set" and by analyzing the interaction between chemical features and key amino acid. The decoys for each active molecule was identified using DecoyFinder in ZINC database [23].

Molecular docking
The obtained list of drug molecules was subjected to virtual screening using AutoDock Vina [24]. The drug molecules with the highest affinities to Rv1311 were re-docked using AutoDock 4 [25] package, which enables the calculation of the inhibitor constant. This software performs the prediction of the bound conformation based on free energy, which was calculated on the basis of the empirical force field and the Lamarckian Genetic Algorithm [25]. The AutoGrid module was used to create a grid box of dimensions 129 × 94 × 104 Å along the XYZ directions with a spacing of 0.375 Å. The efficiency of the predictions was amplified by setting the parameters associated with the Lamarckian genetic algorithm to the maximum efficiency values. This was achieved by setting a number of individuals in the population to 250 and the maximum number of energy evaluations to a "longer" interval. The 100 docked conformations were generated for each Rv1311 and ligand systems which were further grouped according to the RMSD tolerance of 2.0 Å. The re-scoring of the generated conformations were performed on the basis of the scoring function present in DrugScoreX server [26] and the docked conformation with the highest score was selected for the Molecular Dynamics (MD) simulations.

Molecular dynamics (MD) simulations
The obtained docked complexes were subjected to MD simulations using GROMACS 5.1.2 [27], and the topologies of the complexed structures were generated using the GROMOS96 53a6 force field [28]. The GROMACS package lack an appropriate force field parameters for drug-like molecules, therefore, the PRODRG server [29] was used for the generation of topologies and coordinate files of the ligand molecules. Furthermore, the partial charges in the generated topologies were corrected using the DFT method implemented in GAUSSIAN which utilized the B3LYP 6-31G (d,p) basis set and the CHELPG program [30]. Afterward, all the docked complexes were solvated in the SPC/E water model [31] as well as the neutralization was performed by adding the counterions. The NA and CL ions were added to neutralize the systems. Consequently, the neutralized systems were energetically minimized by steepest descent and conjugate gradient algorithms with a convergence criterion of 0.005 kcal/mol and the restraints were applied to the structure of the ligands before the equilibration phase.
The equilibration step was performed in NVT (constant volume) as well as NPT (constant pressure) ensemble conditions, each with a 100 ps time scale. The temperature of the system was maintained at 300 K using Berendsen weak coupling method in both ensemble conditions as well as the pressure, which was maintained at 1 bar by utilizing Parrinello-Rahman barostat in constant pressure ensemble. The final MD simulations were produced using the LINCS algorithm for 100 ns timescale. The generated trajectories were used to analyze the behavior of each complex in the explicit water environment. The deviations in the distances, H-bonds, RMSD (Root Mean Square Deviations), and Radius of Gyration (Rg) were analyzed between the complexed protein and ligand. Furthermore, the free energy of binding was calculated using Molecular mechanics Poisson-Boltzmann surface area (MM-PBSA) protocols implemented in the g_mmpbsa package [32] which provide deeper insights into the interaction mechanisms of the proteins and the ligand molecules.

Drug testing
The two compounds (ZINC65375075 and ZINC11592624) with the highest binding affinities were purchased from Molport, USA and VITASMLAB, respectively. Both compounds were dissolved in Dimethyl sulfoxide (DMSO) and diluted to obtain final concentrations of 50, 25  and 10 μM in 7H9 medium. Afterwards, 100 μL of each compound was added to the appropriate wells. Positive, negative and compound controls were included on each plate. The positive control contained 100 μg/mL rifampicin and H37Rv:pCHERRY3. The negative controls included wells containing media only to identify contamination and an untreated fluorescence control to determine that DMSO used to dissolve the compounds did not affect growth. A compound control included the final concentration of each compound in the absence of M. tuberculosis to confirm the absence of autofluorescence. The background control contained M. tuberculosis H37Rv (without reporter) which is used to subtract the fluorescence measurement to account for the inherent fluorescence of H37Rv. An undiluted H37Rv:pCHERRY3 control was used to set the gain during fluorescence measurement. Plates were incubated at 37°C and readings were taken at time 0 and at 24-hour intervals thereafter for five days using the BMG Labtech POLARstar Omega plate reader (BMG Labtech, Germany, excitation: 587 nm, emission: 610 nm). The experiment was replicated three times to ensure the validity of the results.

Results and discussion
The revival of TB drug research and development is primarily driven by the urgent need for global eradication of the disease as well as the development of novel and more efficient therapies against drug-resistant strains [34]. Accordingly, a variety of therapeutic products are now in the diverse stages of clinical trials, and numerous drug development projects are in the pipelines, with varying degrees of success [34]. It is evident that the integration of the computational approaches with the experimental workflows can accelerate TB drug discovery [35].
Previous studies by our group aimed to elucidate drug resistance mechanisms present in M. tuberculosis by using genome and metabolic pathway mapping which led to the identification of nine putative targets, which could be used for the development of novel therapeutic agents [13]. In this study, Rv1311 was used for the identification of candidate therapeutic agents against TB, as it is crucial for the functioning of TB ATP synthase [8]. Therefore, the structure of Rv1311 was predicted and the pharmacophore models were generated. The database searching using these pharmacophore hypotheses leads to the identification of 35 respective inhibitory features carrying agents. These molecules were subjected to the virtual screening against the ligand binding site of Rv1311, and the top two inhibitors with highest binding affinities were selected for the further studies which are discussed in the following sections:

Assessments of generated structures
In the primary analyses, the homologous templates were identified in Protein Data Bank (https://www.rcsb.org/, PDB) using BLAST searching algorithms. The database search identified epsilon subunit and ATP complex of F1F0-ATP synthase from the thermophilic Bacillus PS3 (PDB ID -2E5Y) with a sequence identity of 37% as well as epsilon subunit of F1F0-ATP Synthase from Escherichia coli (PDB ID -1AQT, identity 26%) as the possible template. The 3D coordinates of Rv1311 were predicted by satisfying the spatial restraints using MODELLER [36]. The predicted structure of Rv1311 contains two domains, an Nterminal β sandwich (containing eight β strands) as well as Cterminal α helices domain (Fig. 2). All the β strands were arranged in an anti-parallel manner in the β sheet, while the hydrophobic space between the two α helices forms an alanine zipper resembling structure (Fig. 2). The stereochemical validation using PROCHECK showed the presence of 95.2% residues in the allowed region of the Ramachandran plot while 4.8% were occupied in the disallowed regions. The Rv1311 model was categorized as "Pass" in both Verify 3D [37] and WHATCHECK (https://swift.cmbi.umcn.nl/gv/whatcheck/) servers. The ProQ server [38] was used for predicting the quality of the generated Rv1311 model, computed the LGscore of 3.542 and MaxSub value of 0.192 and classified it in "very good model" category. The quality of the predicted 3D model was further verified using ProSA server [39]  which calculated the Z-score of −4.33, was in the range of native proteins of similar size. In order to improve the quality of the generated models, they were further subjected to the side chain optimization and for the steric clashes refinements. The RMSD values of 1.453 Å and 1.013 Å were calculated after superimposition of Rv1311 with templates 1AQT and 2E5Y respectively. These observations indicated the reliability of the predicted model; therefore, further analyses were performed.

Pharmacophore feature analyses
The structure of Rv1311-ATP complex was predicted and the receptor-ligand pharmacophore generation modules lead to the development of 10 hypotheses using the structure of Rv1311-ATP complex. Around 70 features were identified in the ligand ATP, while six features were predicted to match the Rv1311-ATP interactions (Fig. 3). The pharmacophore_01 was selected because the selectivity score was calculated to be around 11.098 for the respective model. The five features were observed in this model, namely, two hydrogen bond acceptors, two hydrogen bond donors and one negative ionizable feature (Fig. 3). The hydrogen bonds within a distance of 3.0 Å between the Rv1311 and ATP were identified and retained in the generated model.
Moreover, the hydrophobic features on the ATP within 5.5 Å of the centroid of hydrophobic residues were included, while for the prediction of positive or negative ionizable features, the charge interactions within 5.6 Å were involved in the hypothesis. Therefore, the generated hypothesis was used for searching suitable inhibitors in the ZINC database. The 35 respective features containing potential inhibitory molecules were identified, which are subjected to further screening.
The validation of the generated pharmacophore model was performed using the decoy set. Around 24 active inhibitors of ATP synthase was collected from the literature [40][41][42] and a set of 864 decoys were identified in ZINC database using DecoyFinder. The active molecules and the decoys were mixed to form a single set of molecules, which were screened using the generated pharmacophore model. It was observed that the model differentiate the actives and inactive molecules with an accuracy of 93.13%. The true positive (TP), true negative (TN), false positive (FP) and false negative (FN) values were found to be 23, 804, 60 and 1 respectively. The true positive rates and false positive rates was used for calculating ROC curve (Fig. S1). The area under curve (AUC) and BEDROC values were calculated to be around 0.942 and 0.656. These generated parameters indicated the reliability of pharmacophore model which classify the active molecules with high accuracy. Furthermore, the interactions between chemical features and  key amino acids were analyzed and the Glu85 which may form the active site of the Rv1311 was included in the Hydrogen bond donor feature of the pharmacophore model (Fig. 3).

Structure activity relationship (SAR) studies
In order to evaluate the inhibitory efficiency of the classified inhibitors, a 3D Quantitative structure-activity relationship (3D-QSAR) was performed. The structures and activity information of ATP synthase inhibitors was collected from the literature [40][41][42]. The structures of inhibitors were constructed using the drawing utilities of Discovery Studio (DS) and then minimized using the optimization modules. The structures of the ligands were subjected to "Prepare Ligands for QSAR" module which prepare the dataset for further steps. The Minimum Inhibitory Concentration (MIC) was selected as the dependent variable and the dataset was divided into the "Training" and "Test" sets. Consequently, the 3D-QSAR model was constructed using "Create 3D QSAR model" module of the DS, with 5 fold cross validations. The R 2 and Q 2 values for the generated model was observed to be 0.996 and 0.612 respectively, indicating the reliability of the predicted model. The activities of compounds 27 and 35 were predicted to be around 0.147 μg/mL and 0.130 μg/mL respectively, which was falling in the range of 0.138-0.217 μg/mL calculated for known ATP synthase inhibitors using generated 3D-QSAR model. These observations indicated that these studied molecules have potential to inhibit the functionalities of ATP synthase.

Rv1311 interaction profiles
The 35 classified drug compounds were subjected to virtual screening using AutoDock Vina. The top two derivatives with highest binding affinities were selected for further studies, and bedaquiline was used as a control. The compounds were re-docked using AutoDock suite in order to validate the generated outcomes. The information about the active site of Rv1311 was deduced by comparing with the structure of the template (PDB ID -2E5Y). Similar to other bacterial ε subunit, Asp91 (corresponding to Asp89 of the template) may form a major portion of the ATP binding motif and may be involved in recognition of adenine ring of ATP [43]. The Asp has completely conserved in the respective motif and its side chain is responsible for the fitting of ATP in the correct orientation by providing an electrostatic repulsion against triphosphate groups [43]. In addition, the Glu85 (corresponding to Glu83) may play a significant role and may be involved in the formation of adenine pocket, due to the presence of long side chain [43]. Furthermore, the Glu residue can also be involved in the stabilization of ribose by providing the hydrogen bonding and is responsible for increasing the binding affinity of the ATP [43]. Glu85 may also provide stabilization to the orientation of the C-terminal domain (CTD) relative to the N-terminal domain (NTD) through the hydrogen bonding [43].
Therefore, the grid center was focused on Glu85 and Asp91. For Rv1311_27 (docked system of Rv1311 protein and ZINC65375075) complex, the residues Glu85, ILE 90, Ala94, and Ala95 was observed in the binding pocket ( Fig. 4A) with the free energy of binding of −4.41 kcal/mol and inhibition constant was predicted to be around 581.56 μM (Table 1). Moreover, in complex Rv1311_35 (Docked system  of Rv1311 protein and ZINC11592624), the residues Asp91 was obtained along with Ala94 and Arg113 in the binding pocket (Fig. 4B), the free energy of binding and inhibition constant was calculated to be around −5.73 kcal/mol and 832.95 μM respectively (Table 1). In addition, the performances of the studied inhibitors were validated through comparison of the obtained parameters for Rv1311_Beda complex (Docked system of Rv1311 protein and bedaquiline), which showed Glu85, Glu89, Asp91 and Ala94 (Fig. 4C) in the interaction pocket with the free energy of binding of −4.14 kcal/mol and inhibition constant value of 930.44 μM. These observations indicated that the compound 35 is inhibiting comparably to bedaquiline as indicated by the generated parameters (Table 1). Moreover, a comparison of the interaction affinities was performed between the ATP and the studied compounds. The ATP showed free energy of binding energy of −3.7 kcal/mol, which is comparative lower than the studied inhibitors (Table 1). Therefore, the inhibitors may compete with the ATP during the course of inhibition. All three docked systems were subjected to Molecular Dynamics (MD) simulations for further analyses, which are discussed in the subsequent section.

Analyses of conformational dynamics' patterns
The conformational behaviors of the obtained docked complexes were analyzed in the explicit water conditions using the GROMACS software package, which provide a deeper insight into the dynamics of structural and energetic changes. All the systems were neutralized, minimized and well equilibrated, then subjected to 100 ns MD simulations. In all the studied complexes, NA ions ranges from 20-30 to 13-22 CL ions were added to neutralized the systems. Consequently, they are subjected to the subsequent stages of MD simulations. Then generated trajectories then analyzed using the utilities of the GROMACS. The changes in the pattern of the hydrogen bonding between the Rv1311 and the studied inhibitors were analyzed using "gmx hbond" module. In the Rv1311_27 complex, a maximum of five hydrogen bonds was observed in the system (Fig. 5A). While the highest number of eight Hbonds was obtained in the Rv1311_35 complex as a comparison with Rv1311_Beda which shows the presence of four bonds ( Fig. 5B and C). Furthermore, the "gmx pairdist" was used for the calculation of distances between Rv1311 and the concerned inhibitors. The average distance between Rv1311 and inhibitors 27, 35 as well bedaquiline was observed to be around 0.260 nm, 0.292 nm, and 0.247 nm respectively. But in complex Rv1311_27, very high fluctuations in the distance values were observed between 20 ns -25 ns, indicating the unstable nature of the complex, while the rest of the systems are showing relatively stable nature of binding (Fig. 6A). Moreover, the fluctuations in the constituent residues were compared for each system (Fig. 6B). The residues of complex Rv1311_27 showed relatively higher fluctuations as compared to Rv1311_35 and Rv1311_Beda, particularly in the region between the residues Glu85 and Asp91 (Fig. 6B).
Consequently, the stability and compactness of the resulted complex systems were assessed using generated RMSD and Radius of Gyration (Rg) values. In complex Rv1311_27, relatively higher fluctuations in the RMSD values were observed between 0.3 and 0.8 nm throughout the course of 100 ns MD simulations as compared to the rest of the studied systems, indicating the persistent instability in the system (Fig. 7A). While stable comparative nature was observed in Rv1311_35 and Rv1311_Beda systems, in which fluctuations were observed around 0.3 nm (Fig. 7A). Similar inferences were drawn from Rg plots, in which consistent variations in the Rg curve were observed for Rv1311_27 with the values ranges from 1.4 nm −1.5 nm (Fig. 7B). The Rv1311_Beda showed least values in the Rg plots with fluctuations were observed around 1.4 nm, but Rg values for Rv1311_35 and Rv1311_Beda showed convergence after 90 ns time (Fig. 7B). These observations indicated that both the systems were becoming relatively stable by achieving the comparable compactness in their structural topology.
The Molecular Mechanics Poisson-Boltzmann Surface Area (MM/ PBSA) methods are widely used techniques for estimating proteinligand binding affinities in the explicit solvent conditions because of their efficiency as well as high correlation with experiment [44]. These predictions based on the calculation of binding affinities are essential in providing a solution for a wide range of biophysical challenges involved in the analyses of protein-protein interactions and structurebased drug design [45]. Therefore, the g_mmpbsa module was used for the calculation of the binding affinities between the Rv1311 and the concerned inhibitors. In the Rv1311_27 complex, the electrostatic energy is primarily contributing to the total energy of the system (Fig. 8A), which showed high fluctuations around −500 kJ/mol. While, in Rv1311_35 complex, the total energy was fluctuating in the range of −100 kJ/mol to -400 kJ/mol, with binding affinities were contributed by both Van der Waals and electrostatic energies (Fig. 8B). Furthermore, the highest binding affinities of around −1000 kJ/mol were observed in Rv1311_Beda complex as compared to the other systems, indicating the tightly binding nature of bedaquiline (Fig. 8C). The output generated after MD simulations showed that Rv1311_35 is comparably stable in comparison with Rv1311_Beda, in order to improve the binding affinities several modifications were required in the side of compound 35. The average values of generated energies for each system are listed in Table 2.

In vitro growth assays
To assess the in silico results, we performed in vitro assessment of two of the available compounds in a whole cell assay. We purchased two of the available top binding compounds (ZINC65375075 and ZINC11592624) based on binding affinity scores and interaction analysis from vendors. As a preliminary test, we aimed to identify inhibitors with a Minimum Inhibitory Concentration of at least 50 μM or better. Of the two compounds, ZINC65375075 (compound 27) was found to inhibit M. tuberculosis growth at 50 μM, 25 μM and slightly at 10 μM concentrations (Fig. 9A). Whereas, compound 35 showed slight inhibition at 50 μM with bacteria sustain its growth in the rest of the concentration (Fig. 9B). The limited performance of these selected drugs may be attributed to the restricted cell permeability. In order to understand the variation of M. tuberculosis growth inhibition, the rifampicin control was added to the assay (Fig. 9). These findings suggest that the in silico screening method has merit in identifying novel anti-TB compounds and will help in making drug discovery more affordable. Future work will include in silico modifications of compound ZINC11592624 to improve binding affinity and virtual screening for isomers of ZINC11592624 that are commercially available to test in vitro.

Conclusions
In this study, the Rv1311 (ε subunit) was selected as a drug target because it is essential for the functioning of the ATP synthase and thus crucial for the energy generation in M. tuberculosis. The structure of Rv1311 was predicted and the pharmacophore modeling performed enabled the identification of 35 novel inhibitors in the ZINC database. These inhibitors were subjected to virtual screening against Rv1311 and subsequently, to a variety of MD simulation techniques. This leads to the identification of derivatives with the highest binding affinities against Rv1311. Bedaquiline was used as a control which provides the deeper insight into the inhibition of Rv1311 and the most suitable docking complex was selected through comparison with the bedaquiline complex. Our in silico finding were validated using in vitro experiments which showed ZINC11592624 is responsible for the reduction of M. tuberculosis growth in culture medium. This study will facilitate the development of novel inhibitors against the ATP synthase of M. tuberculosis and can also help in the development of therapies against drug-resistant strains.

Declaration of Competing Interests
Authors declare no competing interests.  Initiative of the Department of Science and Technology and National Research Foundation (NRF) of South Africa, award number UID 64751. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NRF.