A fragment-based approach to assess the ligandability of ArgB, ArgC, ArgD and ArgF in the L-arginine biosynthetic pathway of Mycobacterium tuberculosis

Graphical abstract


Introduction
Despite the availability of effective chemotherapy, tuberculosis (TB) remains a leading infectious cause of morbidity and mortality worldwide. In 2019, an estimated 1.2 million deaths were caused by TB, and an additional 208,000 were a result of HIV-TB coinfection [1]. While the existing multidrug treatment regimen has a success rate of 85% in drug-sensitive TB cases (in the 2018 cohort), drug toxicity, a long treatment duration leading to patient non-compliance, as well as incompatibility with antiretroviral therapy all compromise its effectiveness. Alarmingly, the emergence of multi-drug resistant (MDR) and extensively-drug-resistant (XDR) strains of Mycobacterium tuberculosis has further undermined the efficacy of current antitubercular therapy: only 57% of MDR cases were successfully treated worldwide in the 2017 cohort. New antitubercular agents are therefore urgently required and novel chemical scaffolds and mechanisms of action must be identified that can shorten therapy and circumvent development of drug resistance. While many drugs currently in clinical use are bacteriocidal, persister populations of metabolically heterogenous bacilli can form in the granuloma microenvironments and survive treatment due to acquired phenotypic drug resistance [2]. Eliminating this persister reservoir will be crucial for shortening treatment duration and preventing latent forms/relapse of the disease. Future drug discovery campaigns against M. tuberculosis could potentially address this by pursuing targets that lead to rapid sterilisation of the bacilli inside the host. M. tuberculosis, like Mycobacterium leprae, has retained its ability to synthesise all 20 amino acids and most vitamins. The retention of these biosynthetic genes reflects an evolutionary pressure, suggesting that the pathogenic mycobacteria have ''chosen" not to derive amino acids or many vitamins from the host and lead what has thus been described as an autarkic lifestyle [3]. However, not all amino acid auxotrophies behave the same. Several amino acid auxotrophs have been found to have attenuated virulence inside host organisms, suggesting that while enzymes in amino acid biosynthetic pathways are essential in vitro, the pathogen can scavenge amino acids (albeit insufficiently) inside the host and survive [4][5][6][7][8][9][10]. However, it was shown that methionine and arginine auxotrophs of M. tuberculosis are rapidly sterilised in both immunocompetent and immunodeficient (SCID) mice without the appearance of suppressor/bypass mutants [ (3,4]. Despite the presence of two arginine transporters in M. tuberculosis [11,12] and sufficiently high serum concentrations of arginine in the host [13], the virulence of DargB or DargF mutants is entirely abolished as arginine deprivation results in extensive oxidative damage [4]. The case for drug discovery approaches to target arginine biosynthetic enzymes is further bolstered by work demonstrating that chemical inhibition of ArgJ with Pranlukast, a cysteinyl leukotriene receptor-1 antagonist use to treat asthmatic exacerbations, cleared a chronic M. tuberculosis infection in BALB/c mice [14]. The arginine biosynthesis pathway consists of eight different enzymes (Fig. 1A) all considered to be essential for M. tuberculosis growth in vitro [15]. Except for argA which encodes the first enzyme of the pathway, all other genes are present in a single operon that also includes the repressor argR (Fig. 1B).
Fragment based drug discovery (FBDD) is now an established lead-generation strategy in both industry and academia, having yielded over 30 compounds in clinical trials, including approved cancer drugs like vemurafenib, Kisqali, Balversa and venetoclax [16]. This approach consists of screening a library of small molecules (150-300 Da) against a target of interest using biophysical, biochemical and structural biology methods. The low complexity of fragments allows for efficient exploration of the chemical space of the target, often revealing unexpected binding sites in proteins. Although fragments often bind weakly, they tend to bind to hotspot regions of the target, forming well defined interactions that allow subsequent elaboration into larger drug-like molecules [17,18]. Our group and a few others have pioneered using this approach against different mycobacterial species and different protein targets with varying degrees of success [18][19][20][21][22][23][24][25][26].
Using this approach, we have screened four enzymes of the arginine biosynthesis pathway that have not been explored by drug discovery programmes: ArgB, ArgC, ArgD and ArgF. Herein we report the structures of the four enzymes in complex with fragment hits, including a novel allosteric site of ArgB and allosteric inhibitors of this enzyme. Importantly, this work also assesses the potential of these enzymes as candidates for future drug discovery programmes.

Molecular cloning
The argB gene was amplified from chromosomal DNA of M. tuberculosis H37Rv strain obtained from ATCC (ATCC25618D-2) while the ORFs of argC, argD and argF were purchased as E. coli codon-optimised synthetic gene strings through the ThermoFisher GeneArt Gene Synthesis service. The argB gene was cloned into pHAT4 [27] using NcoI and XhoI sites. The gene strings of argC, argD and argF were cloned into a pET28a vector (modified to include an N-terminal 6xhis SUMO) [28] using BamHI and HindIII restriction sites. All constructs were confirmed by sequencing.

Protein expression and purification
250 ml of autoclaved 2xYT broth (Formedium) prepared in distilled water, containing 100 lg/mL ampicillin for pHAT4:argB or 30 lg/mL kanamycin for pET28a:argC/argD/argF, was inoculated with E. coli BL21(DE3) containing the respective expression construct, and incubated at 37°C with 220 rpm shaking overnight. This primary culture was used the following day to inoculate 6 flasks containing 1 L 2xYT broth and the appropriate antibiotic, and the inoculated flasks were incubated under similar conditions until the OD 600nm reached 0.8-1. Overexpression was induced by the addition of 0.5 mM isopropyl b-D-1-thiogalactopyranoside (IPTG). Thereafter, the flasks were incubated at 20°C with 220 rpm shaking overnight. Cells were harvested by centrifugation at 4200 rpm, 4°C for 20 min in a Beckman Coulter ultracentrifuge. The cell pellets were re-suspended in 50 ml of Buffer A (Table S1), also containing 1 tablet of cOmplete TM , Mini, EDTA-free Protease Inhibitor Cocktail (Roche, Merck), DNase I (Sigma-Aldrich) and 5 mM MgCl 2 . The cells were lysed by ultrasonication for~8 min (pulse on for 20 secs, pulse off for 30 secs, 55% amplitude), the suspensions were kept in an ice bath throughout. The cell lysates were clarified by centrifugation (27000 g, 4°C for 40 min), and the supernatants were syringe-filtered (0.45 lm membrane) to remove any cell debris.
The filtered lysates were subjected to IMAC using a His-Trap 5 ml Nickel column (GE Healthcare Life Sciences) on an ÄKTA Pure system (GE Healthcare Life Sciences), equilibrated with buffer A (Table S1). Isocratic elution was performed using buffer B (Table S1). Proteins were dialysed in buffer C (Table S1) at 4°C and tags were cleaved overnight by adding TEV protease for ArgB or Ulp1 protease, for ArgC, ArgD and ArgF, both at 1:100 ratio.
The dialysed proteins were concentrated to a <5 ml volume using a 30 kDa MWCO Vivaspin 20 centrifugal concentrator (Sartorius) at 5000 g, 4°C and injected onto a HiLoad 16/600 Superdex 200 gel filtration column (GE Life Sciences) equilibrated with buffer C (Table S1). Elution fractions corresponding to the peak of interest in the chromatogram were pooled and fraction purity was assessed by SDS-PAGE. The purest fractions of ArgB and ArgF were pooled and concentrated to 20 mg.ml À1 . Pooled fractions of ArgC were further dialysed into the final storage buffer (5 mM Tris-HCl pH 7.4, 50 mM NaCl) overnight at 4°C, rescued the next day and concentrated to 6.5 mg/mL. ArgD fractions were pooled and aqueous pyridoxal-5 0 -phosphate (PLP, Sigma-Aldrich) was added (2 mM final PLP concentration). Overnight dialysis into the storage buffer (50 mM Tris-HCl pH 7, 100 mM NaCl) was carried out at 4°C to remove excess PLP. The PLP-saturated protein was rescued the next day and concentrated to 14 mg/mL. The protein yields were of 6.7, <1, <1 and 3.3 mg/L respectively for ArgB, ArgC, ArgD and ArgF. All proteins were flash frozen in liquid N 2 and stored at À80°C.

Differential scanning fluorimetry
Fragment screening was carried out in a 96-well PCR plate using a CFX Connect Real-time PCR Detection System (Bio-Rad) for DSF.
For ArgC and ArgD, the 25 lL reaction volume consisted of the following: 2.5 lM ArgC/5 lM ArgD, 100 mM sodium phosphate pH 7, 200 mM NaCl, 5x SYPRO Orange, and 5 mM fragments (960 fragment library). The protocol implemented increased temperature by 0.5°C after every 30 s, going from 25°C to 95°C and measuring SYPRO Orange fluorescence for each temperature cycle. The melt curve RFU (relative fluorescence units) and derivative -d(RFU)/dT values were analysed and plotted using a macros-enabled Excel Workbook: the minima of the melt curves were recorded as the melting temperature (T m ) of the enzymes in the presence of each fragment. The T m of the reference control (protein in the presence of DMSO) was subtracted from all the readings to calculate DT m .

Surface plasmon resonance
Low molecular weight (LMW) screening with the DSF fragment hits was carried out using the T200 Biacore instrument (GE Healthcare Life Sciences). A series S CM5 sensor chip (GE Healthcare Life Sciences) was used for the immobilisation of ArgC on the carboxymethylated dextran matrix through amine coupling. A 25 lg/mL ArgC dilution was prepared in the optimal coupling buffer (sodium acetate pH 5), and immobilisation was performed by manual instructions. The immobilised ArgC was tested using dilution series (19 lM to 2.5 mM) of NADP + and NADPH. 50 mM fragment DMSO stocks were used to prepare 1 mM dilutions in the SPR buffer consisting of 10 mM sodium phosphate pH 7.0, 150 mM NaCl and 2% DMSO (v/v). Each NADP + /NADPH and fragment dilution was injected once at a flow rate of 30 lL/min for a contact time of 30 s, SPR Running buffer (10 mM sodium phosphate pH 7.0 and 150 mM NaCl) was passed for 320 s at the same flow rate, and 50% DMSO v/v (diluted in SPR running buffer) was injected at the end of the cycle to remove undissociated analyte. Solvent correction was carried out to account for DMSO mismatch between the analyte dilutions and the SPR running buffer.
A 30 lg/mL ArgD dilution was prepared in sodium acetate pH 4 buffer for immobilisation. Following the ethanolamine neutralisation step, 1 mM PLP (prepared in the SPR buffer) was injected for a contact time of 420 s to ensure saturation of PLP-binding sites.
The immobilised ArgD was tested using dilution series (19 lM to 2.5 mM) of L-glutamate, N-acetylornithine and L-ornithine. Screening was also carried out against PLP-unsaturated ArgD.

Ligand-observed NMR
All NMR experiments were carried out at 25°C using a Bruker Avance 600 MHz spectrometer with a Triple Resonance Inverse (TCI) Automatic Tuning and Matching (ATM) cryoprobe. T2 relaxation-filtered one-dimensional NMR spectroscopy experiments incorporated a CPMG67 spin-lock time of 200 ms before the acquisition period. Samples (600 mL) containing 2 mM fragment in the absence and presence of 10 mM ArgB were prepared in buffer containing 20 mM potassium phosphate at pH 7.4 and 50 mM NaCl. Additionally, 2% d6-DMSO v/v was present in all samples for fragment solubilisation and field-frequency locking. Displacement experiments were carried out in the same manner by adding 1 mM each of ATP, and N-acetyl-L-glutamic acid or Larginine to the samples containing 2 mM fragment and 10 mM ArgB. The samples were loaded into 5 mm NMR tubes (Wilmad, 526-PP) for measurement, and the resulting spectra were analysed using TopSpin v. 3.5 (Bruker).

Crystallisation of the apoenzymes
Crystallisation screening and optimisation for all the enzymes was performed at 18°C with the sitting drop vapour diffusion method using a Mosquito robot (TTP-Labtech) to set up the crystallisation experiments. For apo ArgB, 300 nL of pure protein at 10 mg.ml À1 was mixed with an equal volume of reservoir solution and equilibrated against 85 mL of the reservoir solution. The selected condition was obtained in Wizards Classic 1&2 crystallisation screen (Rigaku), well G5 (1260 mM ammonium sulphate, 100 mM CHES pH 9.5 and 200 mM NaCl). Crystals appeared after 2 days in this condition and diffracted up to 1.8 Å resolution. For ArgF, 200 nL of pure protein at 20 mg.ml À1 was mixed with an equal volume of reservoir solution. An initial crystallisation condition was identified I the Wizards Classic 3&4 crystallisation screen (Rigaku), well F2 (40% PEG400, 100 mM Tris-HCl pH 7.5 and 200 mM Lithium sulphate). However, crystals obtained in this condition contained two lattices with different orientations and no structural solution could be found despite good quality diffraction. These crystals were ground to produce seeds and a new crystallisation screening was performed using 200 nL of ArgF at 20 mg.ml À1 mixed with equal volume of reservoir solution and 50 nL of seed solution. After several rounds of optimisation, the final crystallisation condition consisted of 150 mM ammonium dihydrogen phosphate and 10 mM praseodymium acetate. Crystals appeared after 2 days and diffracted up to 1.8 Å resolution.
A previously reported crystallisation condition for ArgC [29] was reproduced with some modifications, but the crystals were found to not be suitable for fragment soaking experiments. A new crystallisation screen was therefore performed using 200 nL of pure ArgC at 6.5 mg.ml À1 mixed with an equal volume of reservoir solution. Well A8 from the BCS screen (Molecular Dimensions) produced crystals diffracting to 1.54 Å. This condition consisted of 0.1 M phosphate/citrate buffer pH 5.5 and 20% PEG Smear High (PEG 6 K, 8 K, 10 K) and was optimised to remove the cryoprotection step by adding 20% glycerol (v/v). A second condition with a neutral pH, more amenable to soaking, based on the previous condition was also optimised and consisted of 0.1 M Bis-Tris pH 7, 17% PEG Smear High, 70 mM phosphate/citrate pH 5.6, 20% glycerol.

Crystal soaking and co-crystallisation with natural ligands and fragment hits
To obtain ligand-bound structures, soaking was performed in the crystallisation conditions described above for each protein using the hanging drop vapour diffusion method.
For ArgB, 1.5 mL of protein storage buffer containing 20 mM of ligand was mixed with 1.5 mL of reservoir solution, and drops were left to equilibrate against 500 mL of reservoir solution for 3 days. Crystals were then transferred to the drops and incubated for 16 h. A cryogenic solution was prepared by adding ethylene glycol up to 27.5% (v/v) to the mother liquor. Crystals were briefly transferred to this solution, flash-frozen in liquid N 2 , and stored for data collection. To obtain an ArgB-NAG complex, co-crystallisation with 2 mM NAG was performed instead. Crystals for ArgB-NAG complex were obtained in Wizard Classic 1&2 screen (Rigaku), solution B6, and were flash-frozen in liquid N 2 after a brief soak in a solution containing mother liquor and 27.5% ethylene glycol.
ArgC crystals grown in pH 5.5 were first soaked in 1.5 mL drops containing the mother liquor and 5 mM NADP + for 2 h in hanging drops that were equilibrated against a reservoir of 500 mL. Thereafter, the crystals were transferred to drops containing the crystallisation condition and an SPR-validated fragments (20 mM, 10% DMSO), which were equilibrated against 500 mL of mother liquor also containing a corresponding percentage of DMSO overnight at 19°C. ArgC crystals grown in pH 7 were soaked with 5 mM NADP + only for 5-10 min due to the rapid development of cracks, and transferred to the fragment soaking drops for 5-10 min from where they were fished and frozen.
ArgD crystals were soaked with fragments at a concentration of 50 mM overnight in otherwise the same manner as ArgC crystals grown in pH 5.5. A cryogenic solution was prepared by adding 30% ethylene glycol to the mother liquor. Crystals were briefly transferred to this solution and flash-frozen in liquid N 2 .
ArgF crystals were soaked in drops containing crystallisation condition and 20 mM of ligand and equilibrated against 500 mL of reservoir solution for 16 h. A cryogenic solution was prepared by adding 25% ethylene glycol to the mother liquor. Crystals were briefly transferred to this solution and flash-frozen in liquid N 2 .

X-ray data collection and processing
X-ray diffraction data (single-wavelength anomalous diffraction) were collected on beamlines i02, i03, i04, i04-1 and i24 at the Diamond Light Source (DLS), UK and on id30B at The European Synchrotron Radiation Facility (ESRF). Diffraction data were processed and reduced with autoPROC from Global Phasing Limited [30] or Xia2 [31]. The apo-form of ArgB was crystallised in the R3 2 space group with one protomer per asymmetric unit (ASU) and the ArgB:NAG complex in the P6 3 space group with two protomers per ASU. ArgF was crystallized in the P2 1 space group with 6 protomers in the ASU. ArgC was crystallised in C2 and P2 1 space groups, with 2 and 4/8 protomers per ASU respectively. ArgD was crystallised in the P2 1 space group as well but with 4 protomers in the ASU.
Initial phases were determined with PHASER [32] from PHENIX software package [33] using the M. tuberculosis ArgB structure (PDB: 2AP9), M. tuberculosis ArgF structure (PDB: 2P2G), M. tuberculosis ArgC structure (PDB: 2I3G) and E. coli Succinyl-ornithine transaminase (AstC, 42% sequence identity, PDB: 4ADB) as a search models, respectively for ArgB, ArgF, ArgC and ArgD. Model building was done with Coot [34], and refinement was performed in PHE-NIX [33,35] for ArgB, ArgF, and ArgC. For ArgD, after the initial molecular replacement solution and a single cycle of refinement, PHENIX AutoBuild was used to generate a model for M. tuberculosis ArgD that was then refined with Coot and PHENIX. Structure validation was performed using Coot and PHENIX tools [33,34]. All figures were prepared with PyMOL (The PyMOL Molecular Graphics System, Version 2.0 Schrodinger, LLC).

Isothermal titration calorimetry
The binding interaction between ArgB or ArgF and ligands was characterised at 25°C, using a MicroCal ITC200 titration calorimeter (Microcal). An ArgB concentration between 75 and 150 mM was used for all titrations. Ligands (0.75-2 mM) were injected in 1.5 mL aliquots with a 150 s spacing between injections for compound 1 and 110 s spacing for all the others. For compound 2, Lcanavanine and L-arginine two titrations were concatenated. Titration data was recorded in 25 mM HEPES pH 7.4 with 200 mM NaCl. Data were analysed by fitting a simple single-site model using Origin software (Microcal) (NMR711 and NMR446) or a six-site sequential binding model (ArgB and L-canavanine).
ArgF was dialysed in 50 mM HEPES pH 8.0, 200 mM NaCl before it was loaded into the calorimetry cell at concentrations of 75-100 lM with the addition of 1 mM DTT. Ligand solutions at concentrations of 1 mM were dissolved in the same buffer and typically injected between 0.5 lL and 2 lL at 150 s intervals with stirring at 750 rpm. Buffer-ligand titrations were carried out as reference runs and subtracted from the protein-ligand titration to remove the heat of dilution. Data were analysed by fitting a simple single-site model using Origin software (Microcal).

Enzymatic assays
ArgB activity was assessed by a colorimetric assay that followed the release of ADP by measuring the oxidation of NADH at 340 nm, for 30 min, in the presence of pyruvate kinase and lactate dehydrogenase in a PHERAstar plate-reader (BMG-Labtech). The enzymatic reactions (200 mL) were performed at 25°C and contained 50 mM Tris pH 7.5, 200 mM NaCl, 50 mM KCl, 10 mM MgCl 2 , 0.3 mM NADH, 2.5 mM phosphoenolpyruvate, 0.3 mM ATP, 1.25 mM N-a cetyl-L-glutamate (NAG), 10% DMSO (v/v), 4 units of pyruvate kinase/lactate dehydrogenase, 0.5 mM ArgB, and varying concentrations of inhibitors. Inhibitors were also individually screened against the coupled enzymes to eliminate any compounds interfering with the other assay components. Competition assays were performed in the same conditions using 0.3 mM of ATP or NAG and varying the other substrate concentration.
To synthesise the ArgC substrate, a reaction mixture containing 3 lM ArgB, 1 mM NAG and 1 mM ATP in 50 mM Tris-HCl pH 7.5, 100 mM NaCl and 40 mM MgCl 2 was made and the reaction was allowed to proceed for 1- All conditions were prepared in triplicates. 3.5 lM ArgC was added just before measurements were started.
To assess the activity of ArgD an end point assay that follows the reverse reaction of the enzyme was used in the presence and absence of fragments. Reactions (100 lL) were set up containing the following: 100 mM Tris-HCl pH 8, 3 mM N-acetylornithine (NAO), 3.4 mM a-KG, 20 lM PLP and 4 lM enzyme. Duplicate reaction mixtures were prepared for each time point and 8 time points (0, 5, 10, 15, 20, 30, 45, 60 min) were tested in total. Two controls were prepared: one with all reaction components except the enzyme, and the other with all reaction components except NAO (NAO was added after HCl treatment). The reactions tubes were allowed to equilibrate in a heating block set at 37°C for two minutes, and the enzyme was added to initiate the reaction. After the stipulated incubation times, contents of the reaction tubes were quickly transferred to 1.5 ml microcentrifuge tubes containing 60 lL of 6 N HCl to stop the reactions. These tubes were then kept in a heating block set at 95°C for 30 min, after which they were cooled to 25°C in a water bath. 200 lL of 3.6 M sodium acetate was added to each tube (final concentration of 1.8 M), along with 40 lL of 30 mM 2-aminobenzaldehyde (final concentration of 3 mM in a total volume of 400 lL). A yellow colouration started developing as soon as the latter was added, the contents were vortexed and the tubes were incubated at 25°C for 15 min. 200 lL of each reaction mixture was transferred into wells of a 96-well flat bottom UV transparent microplate, and absorbance at 440 nm was measured using the PHERAstar plate reader. All experiments were performed at least in triplicate in a PHERAstar plate-reader (BMG-Labtech) and the control without ArgD was used for blank subtraction. Data were analysed with GraphPad Prism (Graphpad Software). All reagents were obtained from Sigma-Aldrich.

M. tuberculosis culture condition and minimum inhibitory concentration (MIC) determination
Mutant strain DargB and its complemented strain DargB-c were generated before as mentioned [4] All the strains, wild type M. tuberculosis H37Rv, mutant DargB and its complemented strain DargB-c [4] as well as multidrug resistant (MDR) strain V2475 [36] were grown at 37°C to mid-log phase in Middlebrook 7H9 medium supplemented with 10% oleic acid-albumin-dextrosecatalase (OADC), 0.5% glycerol, and 0.05% Tyloxapol supplemented with arginine (1 mM) washed and suspended in 7H9 media ± arginine (1 mM). For MIC, the cultures were diluted to 1/500 in ± arginine supplemented media. Serial two-fold dilutions of each drug were prepared directly in a sterile 96-well plate using 0.1 ml of media with the appropriate supplement in the presence or absence of 1 mM arginine. Same media with only vehicle (no drug) was used as a control. PBS (0.2 ml) was added to all the perimeter wells. The diluted M. tuberculosis strains in ± arginine supplemented media (0.1 ml) were added to each well, and the plate was incubated at 37°C for 7 days. Cell growth was measured by optical density at 600 nm. An aqueous solution of resazurin (0.2 mg/ml; 0.03 ml) was added to each well, and the plate was further incubated for up to two days at 37°C. The MIC was determined as the lowest concentration at which the change of colour from blue (resazurin) to pink (resorufin) did not occur.

DSF fragment screening
In the first stage of the screening, differential scanning fluorimetry was used to screen an in-house library of 960 rule-ofthree compliant fragments against four enzymes of the arginine biosynthesis pathway ArgB, ArgC, ArgD and ArgF. In the case of ArgB, ArgC and ArgF the screening was performed against the apoenzymes while for ArgD it was done with the PLP-bound form since PLP binds covalently to the enzyme and it does not leave the active site during catalysis. Several known ligands (substrates, products, allosteric regulators and co-factors) were used as positive controls for each of the enzymes (Tables 1-4, Fig. S1). In all cases, a fragment was considered a hit when the shift in melting temperature was greater than five times the standard deviation.

ArgB
In the conditions used in the assay, ArgB with 5% DMSO (v/v) displayed a melting temperature of 48°C. The addition of 1 mM ATP, N-acetyl-glutamate and L-arginine showed positive melting shifts of 2.0, 8.6 and 11.8°C respectively (Table 1, Fig. S1a). Of the 960 compounds, a total of 63 (%6.6%) showed a thermal shift greater than five standard deviations (!1.25°C) at 5 mM and were considered hits. Out of those, 14 showed a large stabilization of ArgB with a thermal shift greater than 5°C (Table 1) and were selected for further validation.

ArgC
The apoenzyme ArgC with 5% DMSO (v/v) was found to have a melting temperature of 67.7°C, while 1 mM NADP + gave a small average positive thermal shift of~0.6°C and 1 mM NADPH produced an average negative thermal shift of À0.9°C across all DSF runs performed (Table 2, Fig. S1b). 81 out of the 960 fragments screened (%8.5% of the library) at a concentration of 5 mM gave a positive thermal shift greater than five standard deviations (!2.9°C). An orthogonal biophysical technique, Surface Plasmon Resonance (SPR), was also employed to corroborate these hits ( Table 2).

ArgD
ArgD contains the prosthetic group PLP that does not leave the active site of the enzyme. Therefore, all the PLP sites must remain occupied during screening experiments to represent the native state of the target. In an attempt to saturate all of the ArgD PLP sites, PLP was added to ArgD during the purification of the protein.
Further confirmation was required to assess if most of the sites were now saturated. To examine this, the melting profiles of unsaturated ArgD, saturated ArgD and the two forms in the presence ND -not determined. à -Attempts were made to determine the K d for these ligands but without success. * -L-arginine and L-canavanine ITC data could only be fitted with a sequential binding and the best fit was a six-site model with K d of 4.5 ± 1.0, 4.8 ± 1.0, 6.1 ± 0.9, 9.4 ± 1.9, 12.9 ± 2.7 and 42 ± 8 mM for L-arginine and 4.5 ± 1.2, 4.8 ± 1.0, 6.4 ± 1.2, 34 ± 6.7, 35 ± 5.1 and 137 ± 26 mM for L-canavanine. # -An inhibition % at 2.5 mM is given in Table S2 for compounds which an IC 50 was not possible to obtain.  1 mM PLP were assessed. Unsaturated ArgD exhibited a melting profile with 2 peaks, a large and broad peak with melting temperature of 64.5°C, and a small peak with a melting temperature of 77°C (Fig. S1c). The addition of 1 mM PLP changed the melting profile to a single peak with a melting temperature to 77.5°C (Fig. S1c). ArgD that had PLP added during purification showed a melting temperature of 77.5°C (Fig. S1c). The addition of 1 mM PLP had now a very minor stabilizing effect shifting the melting temperature to 79°C (Fig. S1c) and confirming that most PLP sites were saturated. However, the protein in this state was insensitive to fragment binding with maximum thermal shifts of 0.5°C being observed. We therefore tested the potential of using the unsaturated protein for the screening. As mentioned above, PLPunsaturated ArgD exhibited a melting profile with 2 peaks, a large and broad peak with melting temperature of 64.5°C (Fig. S1c), likely a mixture of two different populations in which none or one of the two protomers contain PLP, and a small peak with a melting temperature of 77°C (Fig. S1c) which most likely corresponded to a PLP-saturated population. The fact that the addition of 1 mM PLP shifts the melting profile to a single peak with a melting temperature of 77.5°C corroborated this hypothesis. Three different types of response in the melting profile to the presence of fragments were observed while screening the PLPunsaturated ArgD (Fig. S1c). Most fragments showed either no effect on the melting profile or a decrease in the melting temperature of the large peak or of both peaks and were discarded (Fig. S1c). A second set caused a change in the melting profile with a large increase in the intensity of the highest temperature peak (Fig. S1c) suggesting that the fragment was preferentially binding to the PLP binding site. The third set, had fragments that led to an increase in melting temperature inferior to 4°C but maintained the melting profile of the PLP-unsaturated ArgD control (Fig. S1c). Fragments of this set could either be binding to the PLP site but not stabilizing the protein sufficiently to show a clear change in the melting profile, or could be binding elsewhere on the protein both in the presence or absence of PLP. This set, represented by 47 fragments (%4.9% of the library) giving melting shifts of at least five standard deviations (!2°C) were therefore selected for SPR validation (Table 3).

ArgF
ArgF displayed a melting temperature of 63.5°C with 1 mM Lornithine and L-citrulline showing a positive melting shift of 1.5 and 1°C respectively. Of the 960 fragments, a total of 105 (%10.9%) showed a thermal shift greater than five times standard deviations (!1.0°C) at 5 mM and were considered hits (Fig. S1d). Out of these 16 displayed a melting shift greater than 3°C and were selected for further validation by X-ray crystallography. Due to the large number of hits for this protein, greater than 10% of the whole library, a clustering analysis of the fragment hits was performed. Centroids for each identified cluster and the representative displaying the highest melting shift were also selected for X-ray validation (Table 4).

Validation of DSF hits using a secondary screening technique
Three different strategies were employed to validate the hits obtained with DSF. For ArgF, the fragment hits were taken directly for X-ray crystallographic validation, for ArgB ligand-based NMR was employed and for ArgC and ArgD SPR was performed to validate the DSF hits.

ArgB
To validate the ArgB hits obtained by DSF, two ligand-based NMR methods, Carr-Purcell-Meiboom-Gill (CPMG) and STD (saturated transfer difference), were used [37,38] and fragments that were validated by at least one method were considered as confirmed hits. CPMG experiments validated 15 out of the 16 fragments while STD validated 11 out of 16. Only one fragment was not validated by either of the methods and thus was not taken forward.

ArgC and ArgD
ArgC and PLP-saturated ArgD were immobilised on an activated carboxymethylated dextran surface via amine coupling and final immobilisation response achieved was %16000 RU for ArgC and %7500 RU for ArgD (1000 RU roughly corresponding to 10 mg/ mL protein immobilised on the surface). For ArgD, a 1 mM PLP injection after immobilization increased the absolute baseline immobilisation response by %1100 RU. This was done to compensate for loss of PLP in the low pH coupling buffer during immobilisation. The baseline throughout the screening experiment remained at the post-PLP injection level (%8570 RU), indicating that PLP was not lost during the experiment.
The immobilised ArgC protein was tested first using dilution series of NADPH and NADP + (19 lM to 2.5 mM), and a clear dose response suggested that predominantly, the enzyme had not been immobilised in an orientation that occluded the active site. Similarly, ArgD was tested using a dilution series of L-glutamate, Nacetylornithine and L-ornithine (19 lM to 2.5 mM).
The fragment hits obtained previously were injected at a concentration of 1 mM. The sensorgrams for all the analytes were inspected visually to exclude fragments with either no discernible response or a ''sticky" profile from further analysis (representative sensorgrams for fragment binders are shown in Fig. S2a and b). The binding level was calculated using the T200 software and adjusted for molecular weights of the analytes (Fig. S2c and d). For ArgC, 22 fragments with a binding level response !20 RU were shortlisted, whereas in the case of ArgD, 20 fragments with a response !10 RU were shortlisted for crystallographic validation. Fragments were thereafter described either as SPR 'positive' or 'negative'.

Crystallographic, biophysical and biochemical validation
The hits obtained for ArgB, ArgC, ArgD and ArgF were then soaked in crystals of the respective protein and X-ray diffraction data was collected. Data collection and refinement statistics for all structures are available in Table S3.

ArgB
We obtained crystal structures for ArgB in the apo form and with the natural ligands N-acetyl-glutamate (NAG) and Larginine (PDB: 7NLF, 7NLN, and 7NLO respectively) ( Fig. 2a and  S3). L-canavanine, the guanidinooxy structural analogue of L-arginine, was found to bind to the same site of ArgB as L-arginine and induced conformational changes similar to those induced by L-arginine (Fig. S4). The thirteen crystal structures of ArgB with fragments in the absence of natural ligands show that all the compounds were unexpectedly bound to a hydrophobic cavity at the interface between two protomers with three of these sites present in the ArgB hexamer ( Fig. 2b-d, S3 and S5). This interfacial site is composed by the residues ala124, val125, gly126, ile127, asp131, ala132, leu134, ala164, met165, leu168 and arg173 and is mostly hydrophobic in nature (Fig. S6a). Due to the nature of this new site, the interactions between the compounds  Fig. 2c and d). Carbon-p interactions are also formed with leu168 for both compounds (Fig. 2c and d). Finally, weak hydrogen bonds are also present between asp167, leu168 and val125 for NMR711 while NMR446 interacts with ile127 and leu168 via hydrophobic contacts (Fig. 2c and d). Furthermore, this site is symmetrical and sits at a 2-fold crystallographic symmetry axis with each compound clearly presenting two binding conformations (Fig. 2e, S3 and S5). Due to the 2-fold symmetry of the site the two different binding conformations likely represent the real interactions of each compound with the protein with neither conformation being dominant. The two compounds (NMR446 and NMR711) also share structural and binding features with a trifluoromethyl group occupying the same position at the binding site.
Enzymatic assays show that, of the 13 compounds, NMR711 and NMR446 were the best ArgB inhibitors, with an IC 50 of 366 and 707 mM, respectively ( Fig. 3a and Table 1). The natural allosteric regulator L-arginine and its analogue L-canavanine (Fig. 3a) have IC 50 values of 186 mM and 1.46 mM respectively. Additionally, ITC experiments showed that compounds NMR711 and NMR446 bind to ArgB with a K d of 7.7 and 23 mM, respectively (Table 1 and Fig. S7), whereas L-arginine and L-canavanine showed complex binding curves that can only be fitted to a sequential binding model showing a cooperative interaction with the different protomers of the hexamer (Table 1 and Fig. S7). NMR competition assays revealed that compounds NMR711 and NMR446 are not competitive with any of the natural ligands (ATP, NAG, and Larginine), with enzymatic assays also demonstrating the noncompetitive nature of ArgB inhibition by both fragments (Fig. S8). This confirms that the results from X-ray crystal structures are not an artefact and that the fragment-binding site is indeed a new allosteric site. The observation that there are no conformational changes in the crystal structures of ArgB in any of the ArgB:fragment complex structures at the obtained resolutions (2-2.5 Å) may be due to constraints arising from crystal packing. Bioinformatics analysis showed that this site is conserved in mycobacterial species and also in closely related actinobacteria, such as Nocardia (Fig. S6b). Nevertheless, it is clear that binding of these compounds, similar to L-arginine binding, causes changes in the energy landscape of the protein that result in allosteric inhibition of the catalytic reaction.

ArgC
Crystal structures were obtained for the ArgC apoenzyme (PDB: 7NNI) and the NADP + -bound holoenzyme (PDB: 7NNQ) (Fig. 4a). Binding of NAPD + causes significant structural changes in two loops of the protein that move from a closed to open conformation (Fig. 4a). Additionally, structures were solved with 4 fragment binders, occupying either of the two distinct pockets: the substratebinding ( Fig. 4c and d) and the NADP(H)-binding pockets (Fig. 4e and f). Fragments NMR322 (5-Methoxy-3-indoleacetic acid) and NMR 571 (Xanthene-9-carboxylic acid) were observed in the substrate-binding pocket (PDB: 7NOT and 7NNR respectively) ( Fig. 4c-d and S3). Both NMR322 and NMR571 engaged the side chains of residues his217 and tyr211 through hydrogen bonds. These two residues are predicted to stabilise the acyl-enzyme intermediate during catalysis. NMR322 also made an H-bond interaction with ser186 and gly187. As compared to NMR571, NMR322 binds deeper in the pocket (Fig. 4c and d).
Fragments NMR401 (6-phenoxy-3-pyridinamine) and NMR863 (5-methoxy-1,3-benzoxazole-2-carboxylic acid) were observed in the ribosylnicotinamide and pyrophosphate regions of the NADP (H)-binding pockets (PDB: 7NPJ and 7NPH respectively) and the majority of the interactions between the protein and these two fragments are hydrophobic or p-interactions. Both fragments form only a single hydrogen bond interaction with the thr325 side chain in case of NMR401 ( Fig. 4e and S3) and with the arg193 backbone amine in the case of NMR863 (Fig. 4f and S3).
Enzymatic assays revealed that 2 mM NMR322 inhibited ArgC activity by 45% whereas 2 mM NMR571 caused a 37% inhibition. 2 mM NMR401 inhibited ArgC activity by 10% whereas 2 mM NMR863 caused a 12% inhibition. Although the thermal shifts obtained for NADP(H)-binding pocket fragments NMR401 and NMR863 were higher than those for substrate-binding pocket fragments NMR322 and NMR571, the SPR binding response for the latter was better and positively correlated with percentage inhibition of enzymatic activity (Table 2).

ArgD
The first crystal structure of the ArgD holoenzyme from M. tuberculosis obtained (Fig. 5a) showed that the protomer has three domains: the smaller N-terminal segment (residues 7 to 74), the relatively larger C-terminal domain (residues 286 to 396), and the central PLP-binding domain (residues 85 to 273), which is also the largest and has a Rossmann-like overall fold (Fig. 5b and S3). The prosthetic group PLP is covalently linked to lys253 via an aldimine linkage. ArgD is a dimeric enzyme like other members of the class III d-aminotransferase family [39]; the active sites are interfacial, and residues of both protomers contribute to the active site architecture (Fig. 5a) (PDB: 7NN1). Fragment NMR608 (3-Hydroxy-2-naphthoic acid) was observed occupying all four binding sites (4 chains in the ASU). It binds closer to the distal phosphate group of PLP than the proximal ring system and the internal aldimine bond, where catalysis actually occurs. However, it makes through its acid group, hydrogen bond interactions with arg142, which could be the key residue involved in substrate binding based on homology with AstC, and with an highly coordinated water through the hydroxyl group ( Fig. 5c  and S3). While the pocket itself has depth, the fragment is more solvent-exposed than the PLP molecule. Although NMR608 was the only hit validated by all techniques, the fragment library contained a compound with similar structure (NMR868), which had only a melting shift of 0.9°C and therefore was not selected initially for further testing. However, soaking of this fragment revealed that it binds in almost the same position as NMR608 ( Fig. 5d and S3). Nevertheless, the orientation of the acid group is slightly different for NMR868 and it no longer interacts directly with arg142. In fact this fragment only has hydrogen bond interactions with solvent molecules and it is only observed in one molecule in the ASU out of four, most likely reflecting a lower affinity (PDB: 7NNC).
Aminotransferases are often assayed for the reverse reactions they catalyse because in most cases substrates for the forward reaction are not commercially sold. The ArgD holoenzyme can also use N-acetylornithine (NAO) and a-ketoglutarate (a-KG) to generate N-acetyl-c-glutamyl-semialdehyde and glutamate. The semialdehyde product spontaneously cyclises into D 1 -pyrroline-5-carboxylic acid and can react with the reagent 2-aminobenzaldehyde to yield a dihydroquinazolinium compound (bright yellow colouration) that absorbs at 440 nm [40,41]. This assay was employed to assess the effect of the fragment hits on ArgD. NMR608 exhibited very weak activity with only 31% and 16% inhibition observed at 4 and 2 mM respectively. This is consistent with the crystal structures where the fragment is highly exposed to the solvent.

ArgF
Crystal structures of apo ArgF and of ArgF in complex with the natural ligand carbamoyl phosphate were initially obtained (PDB: 7NNF and 7NNV respectively). Carbamoyl phosphate interacts with ArgF through hydrogen bonds with the side chains of ser50, thr51, arg52, arg101, his128 and gln131 but also with the backbone amines of thr51, arg52 and thr53 and the carbonyl of cys264 ( Fig. 6a and S3). The residues ser50, thr51, arg52 are at the N-terminus of a-helix 2 and the phosphate group of carbamoyl phosphate sit at the positive pole of the helix. Binding of carbamoyl phosphate to ArgF slightly displaces a-helix 2 when compared to the apo structure (Fig. S9a). Soaking ArgF with fragment hits yielded eight crystals structures. All fragments occupied a site at the interface between two protomers of the ArgF trimer but not all sites are equally occupied by all fragments (Fig. 6b-e and S9bf). This site contacts directly with the a-helix 2 and sits between this helix and a-helix 3 of the opposing protomer. The site is formed by residues thr51, arg52, phe55 of a-helix 2, leu265, ala289, arg292 of protomer 1 and ile45, ser75, thr76, leu78, glu82, thr87, leu91 and tyr94 of protomer 2. These residues form a cavity that opens to the carbamoyl phosphate binding site. Binding of fragments at this site, shifts the position of a-helix 2 by 1.9 Å, when compared to the carbamoyl phosphate structure, and also affects the conformation of arg52 which is involved in carbamoyl phosphate binding. Fragments were found to bind in two distinct sub sites across the main binding site and could be divided in three different groups based on their mode of binding. NMR007, 078, 464 and 502 occupied the top area of the site (subsite 1) and contacted with a loop of protomer 2 that is composed by residues asp72 to leu84 that covers the site (Fig. 6c and S9b-d). NMR801, the single representative of this group sits at the bottom of the site (subsite 2) between a-helix 2 of protomer 1 and a-helix 3 of the opposing protomer (Fig. 6d). NMR288, 812 and 817 have two molecules binding at this site, with one molecule occupying each subsite ( Fig. 4E and S2E-F). All fragments keep a-helix 2 in a position close to the apo structure or move further away from the position this helix occupies when carbamoyl phosphate is bound, albeit very slightly. A K d value could only be determined for NMR007 and NMR812 and both fragments showed affinities worse than 100 mM (Table 4 and Fig. S10).

Effect of ArgB inhibitors in M. tuberculosis growth
Considering the four enzymes of the arginine biosynthesis pathway screened, ArgB hits exhibited higher potency by far, with NMR711 and NMR446 being selected to assess their effect on M. tuberculosis together with L-canavanine. The ability of these compounds to inhibit M. tuberculosis growth was examined by measuring their minimum inhibitory concentrations (MICs) in the absence or presence of arginine (1 mM). All the compounds inhibited the growth of M. tuberculosis H37Rv, DargB-c as well as MDR strain V2475 in media without arginine compared to no drug control (Figs. 7 and 8). MICs for NMR711, NMR446 and L-canavanine were 25-50, >200 and 50 mg/ml against H37Rv, DargB-c and 25-50, >200, >200 mg/ml against V2475 (Figs. 7 and 8). However, when arginine (1 mM) was present in the media, compound NMR446 and L-canavanine had no inhibitory activity against H37Rv, DargB-c and, DargB. This indicates that these compounds are indeed specifically inhibiting M. tuberculosis arginine biosynthesis (Fig. 7). In contrast, the more promiscuous NMR711 was inhibitory for all the above strains in the presence or absence of arginine, suggesting that NMR711 may target additional proteins ( Figs. 7 and 8).
Using the FBDD approach, we have discovered inhibitors that bind to a new allosteric site in ArgB, which has very different properties than that of the active site and L-arginine binding sites, thus opening new possibilities for drug discovery by targeting ArgB. For fragment-sized molecules, both compounds reported in this work bind tightly and allosterically to ArgB and have growth inhibitory activity against M. tuberculosis, suggesting that they have the potential to provide a framework for developing larger and higher affinity molecules against the ArgB protein.

Discussion
The arginine biosynthesis pathway has been established as a good target for anti-TB drug discovery [4,14]. Arginine deprivation in M. tuberculosis induced by knocking out argB and argF results in both in vitro and in vivo sterilisation of M. tuberculosis, without the emergence of suppressor mutants [4]. However, from a pathway with eight enzymes, only ArgJ has been explored in a drug discovery campaign and all other enzymes of the pathway, prior to this work, were yet to be assessed in their potential as suitable targets for drug discovery.
Fragments are potent chemical tools that can efficaciously explore the surface of proteins for new binding sites and their chemical space, even with small libraries of a few hundreds of compounds and can consequently be employed to assess the ligandability of protein targets [18,42]. Therefore, this approach was employed to assess the ligandability of ArgB, ArgC, ArgD and ArgF, and to identify potential starting points for fragment elaboration.
We have screened a fragment library of 960 small compounds (MW 150-300 Da) initially using DSF and subsequently employed ligand-based NMR, SPR, ITC, biochemical assays and X-ray crystallography to validate the DSF hits. Due to the nature of FBDD, a hit is only considered validated when an X-ray crystal structure is obtained. For all the proteins in this work, hits were identified and eventually validated by X-ray crystallography. ArgB had the highest number of X-ray validated hits with a total of thirteen, followed by ArgF with eight, ArgC with four and ArgD with two. Interestingly, in the case of ArgB and ArgF, all the fragment hits were binding to an interfacial site, which in the case of ArgB was confirmed to have functional implications. In the case of ArgF, its close proximity to the active site shows potential to develop compounds that can anchor at the interfacial site and then extend towards the active site to interfere with catalysis. Similarly for ArgJ, the only enzyme of the pathway with known inhibitors prior to this work, the inhibitors were also found to bind to an interfacial allosteric site [14]. Our results further show that in the case of ArgC there are two possible strategies to develop inhibitors, with one targeting the cofactor binding site and the other the substrate binding site. It is not clear at this point which strategy has the highest potential to result in potent inhibitors. Another consideration to take into account is the extent of homology of these enzymes with the human orthologue. While ArgB and ArgC do not have a human orthologue, ArgD and ArgF do and the M. tuberculosis enzymes have identities of 36 and 41% with the human orthologues, respectively. Nevertheless, while the ArgF active site is conserved, the interfacial site of ArgF contains several differences which can be exploited for developing specific inhibitors for the M. tuberculosis enzyme. For ArgD selectivity might be more difficult to achieve since many of the active site residues are conserved in comparison with the human cytoplasmic and mitochondrial enzymes.
Due to the potency of the best fragments against ArgB, we tested them for their ability to inhibit M. tuberculosis growth together with L-canavanine. Remarkably, NMR446 and Lcanavanine not only inhibited M. tuberculosis growth, but were also found to act on-target despite the potential promiscuity of such small compounds, with both becoming inactive after the addition of L-arginine to the media.
Despite these promising results, the interfacial site of ArgB might be the hardest of all sites found in this study to develop small molecule inhibitors. The intrinsic highly hydrophobic nature of the site together with very few opportunities to engage in hydrogen bonds and other polar contacts creates difficulties in rationalising what modifications could improve the potency of the compounds. Furthermore, the fact that we cannot observe conformational changes in any of the structures with fragments bound to ArgB may be due to constraints arising from crystal packing and thus these structures may not completely represent what occurs in solution. It is however also possible that binding to this site does not cause visible conformational changes but still alters the energy landscape of the intramolecular pathways involved in the catalytic cycle. We cannot currently determine which of these two hypotheses is correct.
In conclusion, by using a fragment-based approach, we have discovered inhibitors that bind to novel sites in ArgB and ArgF and to the active sites of ArgC and ArgD, which in case of ArgB show on target activity against M. tuberculosis. The data presented here clearly demonstrates the potentiality and scope to target enzymes in the de novo arginine biosynthesis of M. tuberculosis with dedicated drug discovery programs and we propose ArgC and ArgF as the best candidates for future drug discovery work.