Bioinformatic Analysis of Leishmania donovani Long-Chain Fatty Acid-CoA Ligase as a Novel Drug Target

Fatty acyl-CoA synthetase (fatty acid:CoA ligase, AMP-forming; EC 6.2.1.3) catalyzes the formation of fatty acyl-CoA by a two-step process that proceeds through the hydrolysis of pyrophosphate. Fatty acyl-CoAs are bioactive compounds involved in protein transport, enzyme activation, protein acylation, cell signaling, and transcriptional control, in addition to serving as substrates for β-oxidation and phospholipid biosynthesis. Fatty acyl-CoA synthetase therefore occupies a pivotal role in cellular homeostasis, particularly in lipid metabolism. Our interest in this enzyme stems from its identification, as long-chain fatty acyl-CoA ligase (LCFA), by microarray analysis: we found it to be differentially expressed by Leishmania donovani amastigotes resistant to antimonial treatment. In the present study, we confirm the presence of the long-chain fatty acyl-CoA ligase gene in the genome of clinical isolates of Leishmania donovani collected from the disease-endemic area in India. We predict a molecular model of this enzyme for in silico docking studies using the chemical library available in our institute. On the basis of the data presented in this work, we propose that long-chain fatty acyl-CoA ligase is an important protein and a potential target candidate for the development of selective inhibitors against leishmaniasis.


Introduction
Leishmaniasis is a disease caused by protozoan parasites of the Leishmania genus. Visceral leishmaniasis (VL), also known as kala-azar, is the most severe form of leishmaniasis (http://www.dndi.org/diseases/vl.html). With no vaccine in sight, treatment for kala-azar relies primarily on chemotherapy [1].
Phylogenetic analysis suggests that Leishmania are relatively early-branching eukaryotes whose cell organization differs considerably from that of mammalian cells [2,3]. Hence, the biochemical differences between host and parasite can be exploited to identify new targets for rational drug design. It is also imperative that such targets carry a low probability of drug resistance developing. This can be achieved by targeting an essential cellular process that is under pressure to remain conserved and cannot be bypassed through an alternative pathway.
Gaining new knowledge on fatty acid metabolism will not only provide fundamental insight into the molecular bases of Leishmania pathogenesis but also reveal new targets for selective drugs. Enzymes involved in fatty acid and sterol metabolism have been shown to be important pharmaceutical targets in Leishmania and other kinetoplastida [22]. Triacsin C, a specific inhibitor of long-chain fatty acyl-CoA synthetase, was shown to have an inhibitory effect on the growth of Cryptosporidium parvum in vitro [23].
Four fatty acyl-CoA synthetases with different chain-length specificities have been described previously in Trypanosoma brucei [24,25]. The whole genomes of three Leishmania spp. (L. major, L. infantum, and L. braziliensis) have been sequenced, and putative long-chain fatty acyl-CoA ligase genes, which would be required for initiation of β-oxidation and fatty acid metabolism, are present on chromosome 13 in all three species.
In the present study we confirm the presence of the long-chain fatty acyl-CoA ligase gene in Leishmania donovani clinical isolates collected from the state of Bihar, India [26][27][28][29], which alone accounts for 50% of the total burden of visceral leishmaniasis worldwide [30]. Further progress in understanding this enzyme is likely to come from the whole genome sequence (WGS) project of these clinically important isolates [26][27][28][29], which is underway in our laboratory (http://www.leishmaniaresearchsociety.org/).

Collection of Clinical Isolates. The clinical isolates of L. donovani were collected from two kala-azar patients selected from Muzaffarpur, Bihar. Visceral leishmaniasis was diagnosed by the presence of Leishman-Donovan (LD) bodies in splenic aspirates, which were graded according to standard criteria [30]. Response to sodium antimony gluconate (SAG) treatment was evaluated by repeating splenic aspiration at day 30 of treatment. Patients were designated as antimonial responsive (L. donovani isolate 2001) based on the absence of fever, clinical improvement with reduction in spleen size, and absence of parasites in the splenic aspirate, whereas patients whose splenic aspirates still contained parasites were considered antimonial unresponsive (L. donovani isolate 39) [26][27][28][29].

Sample Collection and Nuclear DNA Isolation. L. donovani isolates 2001 (SAG-sensitive) and 39 (SAG-resistant), used in the present study, were maintained in culture as described previously [26][27][28][29]. For nuclear DNA isolation, 10-15 mL of log-phase culture was centrifuged at 5,000 rpm for 8 min at 4 °C. The supernatant was decanted, and the cell pellet was resuspended in 3-6 mL NET buffer and centrifuged again at 5,000 rpm for 8 min at 4 °C. The supernatant was discarded, and the pellet was redissolved in 750 μL NET buffer, 7.5 μL proteinase K (10 mg/mL stock) (MBI, Fermentas, Cat No. EO0491), and 50 μL of 15% sarkosyl. The sample was incubated at 37 °C overnight for proteinase K activity. The cell lysate was centrifuged at 18,000 rpm for 1 hr at 4 °C. The supernatant containing nuclear DNA was transferred to a fresh tube and treated with RNase (20 μg/mL) (MBI, Fermentas, Cat No. EN0531) at 37 °C for 30 min. DNA was extracted first with one volume of phenol/chloroform/isoamyl alcohol (25 : 24 : 1) and finally with chloroform. Nuclear DNA was precipitated with 2.5 volumes of prechilled absolute ethanol, dissolved in nuclease-free water, and stored at 4 °C for future use.
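As a worked check of the lysis mixture described above, the following minimal sketch computes the approximate final concentrations of proteinase K and sarkosyl from the stated volumes. It assumes the cell pellet contributes negligible volume, which is an assumption of this sketch and not a statement from the protocol; the function and variable names are illustrative.

```python
# Volumes (microliters) and stock concentrations as stated in the protocol text.
# Assumption: the cell pellet adds negligible volume to the mixture.
NET_BUFFER_UL = 750.0
PROTEINASE_K_UL = 7.5
SARKOSYL_UL = 50.0

PROTEINASE_K_STOCK_MG_PER_ML = 10.0
SARKOSYL_STOCK_PERCENT = 15.0


def final_concentration(stock_conc, added_ul, total_ul):
    """Dilution rule: C_final = C_stock * V_added / V_total."""
    return stock_conc * added_ul / total_ul


total_ul = NET_BUFFER_UL + PROTEINASE_K_UL + SARKOSYL_UL

# Convert mg/mL to ug/mL with a factor of 1000.
proteinase_k_ug_per_ml = 1000.0 * final_concentration(
    PROTEINASE_K_STOCK_MG_PER_ML, PROTEINASE_K_UL, total_ul
)
sarkosyl_percent = final_concentration(
    SARKOSYL_STOCK_PERCENT, SARKOSYL_UL, total_ul
)

print(f"total volume: {total_ul:.1f} uL")
print(f"proteinase K: {proteinase_k_ug_per_ml:.1f} ug/mL")
print(f"sarkosyl: {sarkosyl_percent:.2f} %")
```

Under this assumption the mixture contains roughly 93 μg/mL proteinase K and about 0.93% sarkosyl.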

Phylogenetic Analysis. The amino acid sequence of Leishmania long-chain fatty acyl-CoA ligase, obtained from our microarray experiments [4], was compared with sequences available in the GeneDB ORTHOMCL4080 database (http://www.genedb.org/) to identify the nearest ortholog of this sequence in kinetoplastida. Multiple sequence alignments were performed using Clustal W version 1.8 (http://www.ebi.ac.uk/clustalw) and T-Coffee [31]. To calculate evolutionary distances between kinetoplastida long-chain fatty acyl-CoA ligases and human acyl-CoA synthetases (ACSs) [32], phylogenetic dendrograms were constructed by the neighbor-joining method, and tree topologies were evaluated by bootstrap analysis of 1000 data sets using MEGA 3.1 (Molecular Evolutionary Genetics Analysis) [33]. All 26 human ACS amino acid sequences were selected [32], along with their transcript variants, and aligned with the long-chain fatty acyl-CoA ligase orthologs present in the kinetoplastida family in order to define the clade differences between trypanosome and Leishmania long-chain fatty acyl-CoA ligases and human acyl-CoA synthetases.
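The distance-based and bootstrap steps described above can be illustrated with a minimal sketch. It is not the MEGA 3.1 procedure and does not build a neighbor-joining tree: it only computes pairwise p-distances on an alignment and resamples alignment columns with replacement, counting how often two sequences remain nearest neighbours. The three aligned sequences are toy, hypothetical data, and all function names are illustrative.

```python
import random

# Toy, hypothetical aligned sequences of equal length (not the real ligases).
ALIGNMENT = {
    "seqA": "MKTAYIAKQR",
    "seqB": "MKTAYLAKQR",
    "seqC": "MRTSYLGKHR",
}


def p_distance(a, b):
    """Fraction of differing sites, skipping columns with a gap in either row."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(alignment):
    """Pairwise p-distances keyed by (name, name)."""
    names = sorted(alignment)
    return {(m, n): p_distance(alignment[m], alignment[n])
            for m in names for n in names}


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap data set)."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols)
            for name, seq in alignment.items()}


def closest_pair(matrix):
    """Distinct pair with the smallest distance (ties go to the first pair)."""
    return min((k for k in matrix if k[0] < k[1]), key=lambda k: matrix[k])


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_reps = 1000           # number of bootstrap data sets, as in the text
support = sum(
    closest_pair(distance_matrix(bootstrap_replicate(ALIGNMENT, rng)))
    == ("seqA", "seqB")
    for _ in range(n_reps)
)
original = distance_matrix(ALIGNMENT)
print("p-distance seqA/seqB:", original[("seqA", "seqB")])
print("fraction of replicates grouping seqA with seqB:", support / n_reps)
```

In a real analysis each replicate would be used to rebuild the full tree, and support would be counted per clade instead of per nearest-neighbour pair.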

Homology Modeling of Leishmania Long-Chain Fatty Acyl-CoA Ligase. The amino acid sequence of Leishmania long-chain fatty acyl-CoA ligase was retrieved from the NCBI database (GenBank Accession No. XM_001681734). Because no 3D structure of this protein was available in the Protein Data Bank (PDB), a 3D model was developed. cBLAST (http://www.ncbi.nlm.nih.gov/Structure/cblast/cblast.cgi) and PSI-BLAST searches were performed against the PDB with default parameters to find suitable templates for homology modeling. The sequence alignment of Leishmania long-chain fatty acyl-CoA ligase with the respective templates was carried out using the CLUSTALW (http://www.ebi.ac.uk/clustalw) and MODELLER 9v8 programs [34,35]. The sequences showing the highest identity, with high scores and low E-values, were used as reference structures to build the 3D model. The long-chain fatty acyl-CoA ligases of Thermus thermophilus (PDB Accession Code: 1ULT, 1V25, 1V26) [36] and Archaeoglobus fulgidus (PDB Accession Code: 3G7S) served as templates for homology modeling on the basis of their maximum sequence similarity to Leishmania long-chain fatty acyl-CoA ligase. The alignment was manually refined in some loop regions of the templates. The resulting alignment was used as input for automated comparative modeling to generate the 3D model structure of Leishmania long-chain fatty acyl-CoA ligase. The academic version of MODELLER 9v8 was used for model building. The backbone of the core region of the protein was transferred directly from the corresponding coordinates of the templates, and side chain conformations were generated automatically. Out of 50 models generated by MODELLER, the one with the best DOPE score, minimum MOF (Modeller Objective Function), and best Verify3D profile was subjected to energy minimization.
In order to assess the stereochemical qualities of 3D model, PROCHECK analysis [37] was performed and Ramachandran plot was drawn.
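The model selection step described above (best DOPE score, minimum objective function, best Verify3D profile) can be sketched as a simple ranking. The scores below are hypothetical placeholders, and combining the three criteria by summed rank is an assumption of this sketch; the text does not state how the criteria were weighed against each other.

```python
# Hypothetical scores for illustration only; real values come from the
# modeling run. "verify3d" stands for a summary score where higher is better.
models = [
    {"name": "model_01", "dope": -51200.0, "objective": 3150.0, "verify3d": 0.81},
    {"name": "model_02", "dope": -52950.0, "objective": 2980.0, "verify3d": 0.86},
    {"name": "model_03", "dope": -52100.0, "objective": 3020.0, "verify3d": 0.84},
]


def rank(items, key, reverse=False):
    """Map model name -> rank for one criterion (0 is best)."""
    ordered = sorted(items, key=lambda m: m[key], reverse=reverse)
    return {m["name"]: i for i, m in enumerate(ordered)}


def select_model(items):
    """Pick the model with the lowest summed rank over the three criteria."""
    dope_rank = rank(items, "dope")                    # lower DOPE is better
    obj_rank = rank(items, "objective")                # lower objective is better
    v3d_rank = rank(items, "verify3d", reverse=True)   # higher Verify3D is better
    return min(
        items,
        key=lambda m: dope_rank[m["name"]] + obj_rank[m["name"]] + v3d_rank[m["name"]],
    )


best = select_model(models)
print(best["name"])
```

With 50 candidate models the same function applies unchanged; only the input list grows.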

Metabolism of Long-Chain Fatty Acyl-CoA Ligase Enzyme.
Three types of fatty acyl-CoA ligase have been defined with respect to the length of the aliphatic chain of the substrate: short-chain (SC, EC 6.2.1.1), medium-chain (MC, EC 6.2.1.2), and long-chain (LC, EC 6.2.1.3) fatty acyl-CoA ligase. These utilize C2-C4, C4-C12, and C12-C22 fatty acids as substrates, respectively [9]. The fatty acid activation step involves linking the carboxyl group of the fatty acid through an acyl bond to the phosphoryl group of AMP. Subsequently, the fatty acyl group is transferred to the sulfhydryl group of CoA, releasing AMP [38][39][40]. This magnesium-dependent two-step acylation of fatty acids by fatty acyl-CoA synthetases has been defined as a unidirectional Bi Uni Uni Bi Ping-Pong mechanism [36,39].
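The chain-length classes above can be written as a small lookup, which also makes the overlap at the boundaries (C4 and C12) explicit. This is a minimal sketch using only the ranges stated in the text; the dictionary and function names are illustrative.

```python
# Chain-length ranges (number of carbons, inclusive) as stated in the text.
LIGASE_CLASSES = {
    "short-chain (EC 6.2.1.1)": (2, 4),
    "medium-chain (EC 6.2.1.2)": (4, 12),
    "long-chain (EC 6.2.1.3)": (12, 22),
}


def ligase_classes_for(carbons):
    """Return every ligase class whose stated range includes this chain length."""
    return [name for name, (low, high) in LIGASE_CLASSES.items()
            if low <= carbons <= high]


# Acetate (C2), laurate (C12), and oleate (C18) as example substrates.
for fatty_acid, carbons in [("acetate", 2), ("laurate", 12), ("oleate", 18)]:
    print(fatty_acid, carbons, ligase_classes_for(carbons))
```

Because the stated ranges share their end points, a C12 fatty acid such as laurate falls into both the medium-chain and long-chain classes, whereas oleate (C18) is assigned only to the long-chain ligase.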
Genome analysis suggests that L. major oxidizes fatty acids via β-oxidation in two separate cellular compartments: the glycosome and the mitochondrion [41]. An argument for the involvement of the glycosome in lipid metabolism is that, in each of the three trypanosomatid genomes, three genes encoding half ABC transporters (GATI 1-3) have been found that correspond to peroxisomal transporters involved in fatty acid transport. In T. brucei, it was confirmed that these transporters are associated with the glycosomal membrane [42]. These transporters might be coupled with fatty acyl-CoA ligase in the glycosome, which could supply them with activated fatty acids such as oleoyl-CoA and other fatty acyl-CoAs.
In T. brucei, little β-oxidation was observed in mitochondria. However, T. brucei contains at least two enzymes involved in β-oxidation of fatty acids (2-enoyl-CoA hydratase and hydroxyacyl-dehydrogenase, encompassed in a single protein) with glycosomal localization [43]. The presence of a PTS (Peroxisomal Targeting Sequence) on T. brucei and T. cruzi carnitine acetyl transferase, which catalyzes the last peroxisomal step in fatty acid oxidation, suggests that the major β-oxidation processes are situated in glycosomes [44]. In L. donovani, one of the β-oxidation enzymes, 3-hydroxyacyl-CoA dehydrogenase, has been localized to glycosomes [45]. Leishmania long-chain fatty acyl-CoA ligase is predicted to localize to mitochondria or glycosomes; however, by reference to other organisms, the specific localization of individual long-chain fatty acyl-CoA ligase family proteins needs to be taken into account in future work.
As mentioned in previous studies, β-oxidation has been found to be upregulated in Leishmania amastigotes relative to the promastigote stage [46][47][48]. This increase has been explained by the parasite, in its infectious stage, using fatty acids rather than glucose as carbon and energy source [47]. Long-chain fatty acyl-CoA ligase is the key enzyme involved in β-oxidation of fatty acids, and its compartmentation in the glycosome strongly supports the involvement of this enzyme in cellular biogenesis and its importance at a particular stage of the Leishmania life cycle. In the same way, upregulation of long-chain fatty acyl-CoA ligase in combination with other enzymes involved in fatty acid catabolism might play a crucial role in cell survival at the infectious stage of Leishmania; these analyses must be supplemented with experimental biology.

Southern hybridization (Figure 2) was performed using the 2010 bp long-chain fatty acid-CoA ligase gene PCR product as probe (Figure 2(C)). The same blot was also probed with an alpha tubulin gene probe as an internal control, showing equal loading (Figure 2(B)). Complete digestion with BamHI gave a single band of approximately 3848 bp, whereas PvuII, which cuts once within the gene sequence, and XhoI, which cuts twice, gave two and three hybridizing bands, respectively. These results show that long-chain fatty acid-CoA ligase is present as a single-copy gene in the L. donovani genome. The restriction pattern also verifies that the long-chain fatty acyl-CoA ligase coding regions of L. donovani and L. major are almost the same.

... is less similar compared with other organisms and is likely to be critical in enzyme activity. Motif III was found in all acyl-CoA synthetases and is part of the A-motif (adenine motif).
This region has been described as an ATP/AMP-binding domain in other acyl-CoA synthetases [49][50][51]. The consensus sequence of the A-motif is YGXTE, which is highly conserved in the corresponding region of Leishmania long-chain fatty acyl-CoA ligase (YGFME). From the crystal structure of TtLC-FACS, it was proposed that Y324 is an adenine-binding residue [42]; this residue is conserved in all organisms, including Leishmania. The crystal structure of S. enterica acetyl-CoA synthetase revealed that the glutamate residue of the A-motif is positioned near oxygen O1 of the AMP phosphate [52]. This region, predicted to be involved in substrate binding or stabilization, is also conserved in Leishmania long-chain fatty acyl-CoA ligase. Motif IV comprises the first five residues of the nine-amino acid G-motif (gate motif; 226-VPMFHVNAW-234) of TtLC-FACS [36] and shows less sequence similarity with Leishmania long-chain fatty acyl-CoA ligase (281-CSWCVAGAL-289). From the crystal structure of TtLC-FACS, it was proposed that the indole ring of W234 acts as a gate that blocks the entry of fatty acids into the substrate-binding tunnel unless ATP is first bound, which causes a conformational change that swings the gate open [36]. However, a tryptophan residue corresponding to W234 was not found in any Leishmania, human, yeast, or E. coli fatty acyl-CoA synthetase sequence. Although no highly conserved sequence was identified, a corresponding gate residue may nevertheless be located elsewhere in the structure of Leishmania long-chain fatty acyl-CoA ligase.

The fatty acyl-CoA synthetases are part of a large family of proteins referred to as the ATP-AMP-binding proteins. A common feature of enzymes in this family is that they all form an adenylated intermediate as part of their catalytic cycle. This group of enzymes is diverse, catalyzing the activation of a wide variety of carboxyl-containing substrates, including amino acids, fatty acids, and luciferin.
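The A-motif comparison mentioned above (consensus YGXTE against the Leishmania segment YGFME) can be checked position by position. The sketch below treats X as a wildcard that accepts any residue; the function name is illustrative and the two strings are the only data used.

```python
def consensus_matches(consensus, segment, wildcard="X"):
    """Per-position agreement; a wildcard position accepts any residue."""
    if len(consensus) != len(segment):
        raise ValueError("consensus and segment must have equal length")
    return [c == wildcard or c == s for c, s in zip(consensus, segment)]


matches = consensus_matches("YGXTE", "YGFME")
mismatches = [i + 1 for i, ok in enumerate(matches) if not ok]
print(sum(matches), "of", len(matches), "positions agree; mismatch at", mismatches)
```

Four of the five positions agree; the single difference is at position 4, where the Leishmania sequence carries methionine in place of the consensus threonine.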
Sequence comparison of members of the ATP-AMP-binding protein family has identified two highly conserved sequence elements [53].
In the fatty acyl-CoA synthetase family, there is a third, less conserved sequence element that partially overlaps the FACS signature motif, which is involved in both catalysis and specificity for the fatty acid substrate [54]. The FACS signature motif has a number of notable features. (i) This region contains two invariant glycine residues (at positions 2 and 7) and a highly conserved glycine at position 16; Leishmania long-chain fatty acyl-CoA ligase shares the glycine residues at positions 7 and 16 with other FACSs but has Tyr instead of Gly at position 2. (ii) This region contains six additional residues that are invariant in the fatty acyl-CoA synthetases (residue followed by motif position): W3, T6, D8, D22, R23, and K25; in Leishmania long-chain fatty acyl-CoA ligase the corresponding residues are F3, S6, D8, G22, N23, and D25. (iii) The residue at the fourth position is hydrophobic and is a leucine, a methionine, or a phenylalanine; in Leishmania long-chain fatty acyl-CoA ligase the hydrophobic residue valine occupies position 4. (iv) This region contains hydrophobic residues (leucine, isoleucine, or valine) at positions 4, 9, 18, 20, and 21. These residues, together with the tryptophan or phenylalanine residue at position 3, may form part of a fatty-acid-binding pocket. The five conserved regions of the FACS signature motif are similar across FACSs, whereas Leishmania long-chain fatty acyl-CoA ligase shows some variable regions. These less conserved regions in the FACS signature motif of Leishmania long-chain fatty acyl-CoA ligase are predicted to confer a fatty acid substrate specificity and catalytic activity that differ from those of other fatty acyl-CoA synthetases.
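The position-by-position comparison of the FACS signature motif given above can be tabulated with a short sketch. The residues are transcribed from the text (glycines at positions 2, 7, and 16 plus the six invariant residues); the dictionary names are illustrative.

```python
# Residues at FACS signature-motif positions, transcribed from the text.
FACS_CONSERVED = {2: "G", 3: "W", 6: "T", 7: "G", 8: "D",
                  16: "G", 22: "D", 23: "R", 25: "K"}
LEISHMANIA = {2: "Y", 3: "F", 6: "S", 7: "G", 8: "D",
              16: "G", 22: "G", 23: "N", 25: "D"}

# Positions where the Leishmania residue equals the conserved FACS residue.
shared = sorted(p for p in FACS_CONSERVED if LEISHMANIA[p] == FACS_CONSERVED[p])

# Positions where it differs, as position -> (conserved, Leishmania).
substituted = {p: (FACS_CONSERVED[p], LEISHMANIA[p])
               for p in sorted(FACS_CONSERVED)
               if LEISHMANIA[p] != FACS_CONSERVED[p]}

print("shared positions:", shared)
for position, (conserved, observed) in substituted.items():
    print(f"position {position}: {conserved} -> {observed}")
```

Of the nine positions listed, three (7, 8, and 16) are shared and six are substituted in the Leishmania sequence.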

Phylogenetic Analysis of Leishmania Long-Chain Fatty Acyl-CoA Ligase and Human Acyl-CoA Synthetases Sequences.
We performed phylogenetic analysis to infer the evolutionary relationships of all available kinetoplastida long-chain fatty acyl-CoA ligase sequences (Table 1) and human (host) ACS family sequences. This analysis was performed to establish that the parasite enzyme is clearly different from the human enzymes, an aspect that merits further study in order to validate this enzyme as a drug target. We obtained comparable results using the neighbor-joining distance-based algorithm as well as maximum parsimony. We found 9 clades, including a kinetoplastida clade (one set of six kinetoplastida long-chain fatty acyl-CoA ligase family proteins) with high bootstrap support (Figure 5). The kinetoplastida clade was highly dissimilar and distinct from all ancestral nodes shared with the human ACS family proteins, showing the distinctiveness of kinetoplastida long-chain fatty acyl-CoA ligases, including that of Leishmania. This divergence from the homologous human enzymes may make Leishmania long-chain fatty acyl-CoA ligase an important potential target candidate for chemotherapeutic antileishmanial drugs.

Homology Modeling of Leishmania Long-Chain Fatty Acyl-CoA Ligase Protein. The backbone root-mean-square deviation (RMSD) values between the final model and the template crystal structures are 1.04 Å with Thermus thermophilus (PDB Accession Code: 1ULT, 1V25, 1V26) and 1.40 Å with Archaeoglobus fulgidus (PDB Accession Code: 3G7S) long-chain fatty acyl-CoA ligase. Such small RMSD values indicate that the structures share a common fold and that the generated structure is reasonable for structural similarity analysis (Figure 6). The final modeled structure of Leishmania long-chain fatty acyl-CoA ligase was evaluated for overall quality using available analysis procedures. These analyses compare specific properties of the model with those of known high-quality protein structures using programs such as PROCHECK, Verify3D, and WHATIF (Table 2). An important indicator of the stereochemical quality of the model is the distribution of the main chain torsion angles phi and psi in the Ramachandran plot (Figure 7). The plot shows the vast majority of the amino acids in a phi-psi distribution consistent with right-handed α-helices, with the remainder falling into the beta configuration. Only three residues fall outside the allowed regions. Comparison of the plots shows that the structure is reasonable overall, because the phi-psi distribution of the homology-modeled structure is similar to that of the X-ray structure of Thermus thermophilus long-chain fatty acyl-CoA ligase (PDB Accession Code: 1ULTA). These results show that the modeled structure is reasonably good given the low sequence identity with the templates.
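The RMSD values quoted above follow the standard definition: the square root of the mean squared distance between matched atoms. The sketch below applies it to two coordinate sets that are assumed to be already superposed and matched atom for atom; it does not perform the superposition itself, and the coordinates are toy, hypothetical values, not taken from the model or the templates.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) points.

    Assumes the two structures are already superposed and that point i in
    one list corresponds to point i in the other.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy, hypothetical backbone coordinates in angstroms.
model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
template = [(0.0, 1.0, 0.0), (1.5, 1.0, 0.0), (3.0, 1.0, 0.0)]
print(rmsd(model, template))
```

Here every atom is displaced by exactly 1 Å, so the RMSD is 1.0 Å. For real structures the value depends on which atoms are matched and on how the superposition is done.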

Discussion
Earlier in the course of this work, microarray analysis was performed on the same clinical L. donovani isolates (2001 and 39) in order to identify differential gene expression [4]. Among the differentially expressed genes, the long-chain fatty acyl-CoA ligase gene was significantly upregulated in the SAG-unresponsive clinical isolate [33]; this upregulation was specific to intracellular amastigotes, confirming the involvement of long-chain fatty acyl-CoA ligase in resistance.
Similarly, it has been shown previously that long-chain fatty acyl-CoA ligase, the rate-limiting enzyme of β-oxidation, is upregulated in amastigotes derived from a cloned line of L. donovani ISR because, during the late stages of differentiation, the parasites shift from glucose to fatty acid oxidation as the main source of energy, with a corresponding increase in the enzyme activities associated with β-oxidation [47,48]. Early in vivo studies showed that enzymatic activities associated with β-oxidation of fatty acids were significantly higher in L. mexicana amastigotes [47]. Additionally, microarray experiments with intracellular amastigotes hybridized onto Affymetrix Mouse430_2 GeneChips showed that several genes involved in the fatty acid biosynthesis pathway were upregulated [55]. Studies are currently ongoing in our laboratory on microarray analysis of intracellular amastigotes hybridized to the Affymetrix GeneChip Human Genome U133 Plus 2.0 Array, which should yield further useful information on fatty acid/lipid metabolism within this clinical isolate. A very recent study by Yao et al., 2010, on differential expression of plasma membrane proteins in logarithmic versus metacyclic promastigotes of L. chagasi has also identified long-chain fatty acyl-CoA synthetase [56]. As mentioned before, long-chain fatty acid-CoA ligase is present in both prokaryotes and eukaryotes. The divergence of Leishmania long-chain fatty acyl-CoA ligase from the homologous human enzymes may make it an important potential target candidate for chemotherapeutic antileishmanial drugs. Many differences exist between host and parasite in the structure and arrangement of this enzyme. In addition, Leishmania shows significant divergence and adaptation to specific environmental conditions between its two life stages, in the insect vector and in the human host.
This can affect the parasite's metabolic machinery in terms of the presence of certain pathways, their subcellular localization and expression at different developmental stages, and the interplay between scavenging and synthesis of key metabolites. It has been argued previously [57] that successful targets for metabolic intervention are most likely to be found among enzymes exerting strong control of flux through metabolic pathways. These control points are likely to be species and development dependent. Even if a unique or highly divergent enzymatic process is found in the parasites, this does not necessarily mean that it can be developed as a target for useful inhibitors. On the other hand, enzymes that are present in both the parasites and their animal hosts will often differ sufficiently in sequence for inhibitors to be specific. Finally, even orthologous enzymes functioning in the same pathway and in the same subcellular compartment of the parasites may have different inhibitor-binding properties, leading to variability in the effectiveness and specificity of inhibitors targeting any particular enzyme.
The detection of the long-chain fatty acid-CoA ligase gene in the genome of L. donovani clinical isolates in the present study deserves full exploration with respect to its potential as a drug target. Changes in membrane lipids or deficiency of certain fatty acids and their association with disease have been documented [34,58]. Modulation of enzymes involved in lipid synthesis, and of others possibly involved in cell wall metabolism, may modify the access of drugs to the plasma membrane. Moreover, our microarray experiment indicated that this enzyme is amastigote specific, making it all the more important to study it further and to test whether it can be exploited as a validated drug target. We have also shown earlier in our laboratory [34] that modification of the lipid composition of the parasite plasma membrane might have important implications for susceptibility or resistance to antileishmanial drugs. As this enzyme is involved in several important cellular processes in Leishmania, such as stage-specific expression [47,48], host-parasite interaction [55], cell membrane composition [17,18], phospholipid biosynthesis [16,21], and drug resistance [4], the present study proposes further evaluation of Leishmania long-chain fatty acyl-CoA ligase as a candidate drug target.