In Silico Characterization of an Important Metacyclogenesis Marker in Leishmania donovani, HASPB1, as a Potential Vaccine Candidate

Visceral leishmaniasis is a life-threatening infectious disease worldwide. Extensive experiments have been done to introduce potential vaccine candidates to combat leishmaniasis. The present study was done to evaluate Leishmania donovani hydrophilic acylated surface protein B1 as a potential vaccine candidate using in silico methods. For this aim, server-based predictions were performed regarding physicochemical characteristics, solubility, antigenicity, allergenicity, signal peptide, transmembrane domain, and posttranslational modifications (PTMs). Also, secondary and tertiary structures were predicted using NetSurfP-3.0 and I-TASSER, respectively. The 3D model was further subjected to refinement and validation, and promising B-cell, cytotoxic T-lymphocyte (CTL; human, dog), and helper T-lymphocyte (HTL; human) epitopes were predicted. The protein had a molecular weight of 42.19 kDa, with high solubility (0.749), stability (instability index: 21.34), and hydrophilicity (GRAVY: -2.322). No signal peptide or transmembrane domain was predicted, and the most abundant PTMs were phosphorylation, O-glycosylation, and acetylation. Many coils and disordered regions existed in the secondary structure analysis, and the tertiary model had a good confidence score (-0.79). Next, the ProSA-web and PROCHECK tools showed adequate improvements in the refined model compared to the crude model. Only 4 shared B-cell epitopes among three web servers (ABCpred, BepiPred 2.0, and SVMTriP) were shown to be antigenic, nonallergenic, and with good water solubility. Also, five potent CTL epitopes in dogs and five in humans were predicted. Notably, two HTL epitopes were found to be potential IFN-γ inducers. In conclusion, our results demonstrated several immunogenic epitopes in this protein, which could be directed towards multiepitope vaccine design.


Introduction
Leishmaniases are a group of vector-borne diseases that threaten about 350 million people in 98 countries and have been classified as one of the six major neglected tropical diseases (NTDs) worldwide [1]. The causative agents are protozoan parasites of the genus Leishmania (family Trypanosomatidae), which are transmitted through the bite of female phlebotomine sandflies, i.e., Phlebotomus in the Old World and Lutzomyia in the New World [2]. Leishmania parasites are obligate intracellular organisms with two interchangeable forms: nonflagellated amastigotes within particular immune cells (macrophages and dendritic cells) and flagellated promastigotes within the sandfly gut. The infection manifests in three main clinical forms: cutaneous leishmaniasis (CL), mucocutaneous leishmaniasis (MCL), and visceral leishmaniasis (VL) [3]. CL is the most widespread form, affecting 0.7-1.2 million individuals, mostly in the Americas, Central Asia, the Middle East, and the Mediterranean Basin [4], whereas VL (also known as kala-azar) is the most severe form, with an incidence of 50,000 to 90,000 cases annually, particularly in Brazil, eastern Africa, and the Indian subcontinent. Leishmania donovani (L. donovani) and L. infantum are the principal agents of VL [4,5].
Control strategies for VL have remained unsatisfactory in many endemic areas, possibly due to inadequate control of the vector (sandflies) and/or reservoirs (mostly canids), along with limited treatment options [6]. First-line therapies such as pentavalent antimonials and amphotericin B are costly and may cause severe toxic effects, including cardiotoxicity, nephrotoxicity, and hepatotoxicity. These compounds must be administered over long periods, and drug-resistant parasites may tolerate them, causing treatment failure [7,8]. On this premise, vaccine development seems to be a safer option to effectively control VL in endemic areas.
Advances in genome sequencing technologies and computer science have produced various biomedical databases and computational methods, which have increased our knowledge of host-pathogen interactions at the molecular level; this can be advantageous to vaccinology studies against infectious zoonotic diseases such as VL [9]. In other words, such information enables us to detect, organize, and generate novel antigenic proteins and promiscuous B- and T-cell epitopes in order to devise rational next-generation vaccine candidates in a cost- and time-effective manner [10]. In this sense, several multicomponent vaccine candidates such as the putative Q protein, Leish110-f, Leish111-f, and KSAC have shown protective immune responses [11-13]. A potent vaccine candidate against Leishmania should strongly stimulate IFN-γ-producing type 1 helper T-cells (Th1) via antigen-presenting cells (APCs), resulting in macrophage activation and a subsequent upsurge in nitric oxide (NO) and reactive oxygen species (ROS) to counter infective amastigotes [14]. Previously, several Leishmania antigens have been introduced and used in vaccination studies [15]. The hydrophilic acylated surface proteins (HASPs) are encoded on chromosome 23, at a locus originally called LmcDNA16 [16], in all Leishmania species [17]. Among these, HASPB has received particular attention; it is expressed on the plasma membrane of metacyclic promastigotes as well as amastigotes [18]. It has been shown to be highly immunogenic, inducing durable immunity in canine models of L. donovani infection [19]. The present study was done to characterize some of the physicochemical and structural properties, along with the immunogenic epitopes, of L. donovani HASPB1 (LdHASPB1) using several immunoinformatic approaches.

Methods
2.1. Amino Acid Sequence Retrieval. The amino acid sequence of the LdHASPB1 protein was retrieved in FASTA format from the UniProt Knowledge Base [20], a comprehensive, high-quality, and freely accessible resource of protein sequences and functional information (https://www.uniprot.org/), under the entry name O77301_LEIDO.

Prediction of Basic Antigenic, Allergenic, Solubility, and Physicochemical Characteristics of LdHASPB1. Preliminary physicochemical properties of the protein were predicted using the ExPASy ProtParam web tool (https://web.expasy.org/protparam/) [21]. The server computes the instability index, aliphatic index, grand average of hydropathicity (GRAVY), estimated half-life, amino acid composition, theoretical isoelectric point (pI), and molecular weight (MW). Protein solubility was evaluated using the Protein-Sol web tool, developed by the University of Manchester (https://protein-sol.manchester.ac.uk/). "The server provides a fast, easy-to-use, sequence-based method for predicting protein solubility based on the population average for the experimental Escherichia coli (E. coli) dataset," and proteins scoring above 0.45 are predicted to be more soluble than the average protein in that dataset [22]. The antigenicity of the LdHASPB1 protein was predicted using the VaxiJen v2.0 web server, available at http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html, which performs an alignment-independent prediction of protective antigens with 70-89% accuracy [23]. For this prediction, "parasite" was selected as the target organism, and the threshold was set at 0.45. Finally, the allergenicity of the protein was determined using the hybrid approach of the AlgPred v2.0 online server (https://webs.iiitd.edu.in/raghava/algpred2/), which combines a random forest (RF) machine learning model with basic local alignment search tool (BLAST) similarity searches and Motif-EmeRging and with Classes-Identification (MERCI) motif searches [24].
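As an illustration of two of the sequence-derived indices listed above, the following minimal Python sketch computes GRAVY as the mean Kyte-Doolittle hydropathy value and the aliphatic index as X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), where X is the mole percent of each residue. It is not the ProtParam implementation; the function names and the toy sequences are hypothetical, and the toy sequences are not the LdHASPB1 sequence.

```python
# Kyte-Doolittle hydropathy value for each of the 20 standard residues.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq: str) -> float:
    """Aliphatic index from the mole percent of Ala, Val, Ile, and Leu."""
    seq = seq.upper()

    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / len(seq)

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


if __name__ == "__main__":
    # Hypothetical toy sequences, used only to show the direction of each index.
    for name, toy in [("charged/polar", "DEKRNQDEKG"), ("aliphatic", "AVILAVIL")]:
        print(name, round(gravy(toy), 3), round(aliphatic_index(toy), 2))
```

A negative GRAVY value indicates a predominantly hydrophilic sequence, and a low aliphatic index reflects a small share of Ala, Val, Ile, and Leu residues.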

Secondary and Tertiary Structure Predictions. The structural analysis of the protein began with secondary structure prediction using the NetSurfP-3.0 server (https://services.healthtech.dtu.dk/service.php?NetSurfP-3.0). This server predicts the surface accessibility, secondary structure, disordered regions, and phi/psi dihedral angles of the residues in a given amino acid sequence [32]. Subsequently, a fully automated protein structure prediction tool, Iterative Threading ASSEmbly Refinement (I-TASSER), was used to predict the top five three-dimensional (3D) models of the protein from structural templates identified by the multiple threading approaches of the local metathreading server, LOMETS [33]. The I-TASSER server is accessible at https://zhanggroup.org/I-TASSER/. The quality of each model is estimated by a confidence score (C-score) ranging between -5 and 2, where a higher C-score usually represents a more confidently predicted model [33].
2.5. Refinement and Validation of the 3D Model. The best-predicted 3D model (highest C-score) was submitted to the GalaxyRefine web server for relaxation and energy minimization. This server applies a refinement method tested in CASP10, which rebuilds and repacks the side chains and then relaxes the whole protein structure by molecular dynamics simulation [34]. Next, the refined 3D model of the LdHASPB1 protein was submitted to two web tools for validation, ProSA-web and PROCHECK. The ProSA-web server assigns a Z-score to each input structure, which can be compared with the Z-scores of native proteins of similar size; the Z-score is defined as "energy separation between the native fold and the average of an ensemble of misfolds" [35]. The PROCHECK web tool evaluates the stereochemical quality of a protein structure through a residue-by-residue geometry analysis and plots the phi-psi torsion angles of each amino acid in allowed and disallowed regions, known as the Ramachandran plot [36].

Prediction and Screening of Helper T-Lymphocyte (HTL) and Cytotoxic T-Lymphocyte (CTL) Epitopes. HTL epitopes, i.e., epitopes with specific affinity to class II major histocompatibility complex (MHC-II) molecules, were predicted using the MHC-II epitope prediction tool of the IEDB web server, selecting the recommended prediction method, "Human" as the target host, and the "HLA reference set alleles" option (population coverage over 97%) [41]. The 10 epitopes with the lowest percentile ranks (i.e., the highest predicted binding affinity) were further screened for antigenicity, allergenicity, and interferon-gamma (IFN-γ) induction using the VaxiJen v2.0, AllerTOP v2.0, and IFNepitope (http://crdd.osdd.net/raghava/ifnepitope/) online tools, respectively. The latter employs a dataset of MHC-II-binding IFN-γ inducers and noninducers, and the highest accuracy is reached by selecting the hybrid model (>81.39%) [42], as we did here. The hybrid approach is a combination of motif-based and machine learning-based approaches; according to Dhanda et al. [42], "First of all, the sequences were separated that could be correctly predicted via motif-based approach and the remaining sequences were then predicted using SVM. Finally, the performance was evaluated by adding the truly predicted peptides from the motif-based method with SVM-based predictions." Human-specific 9-10-mer CTL epitopes (MHC-I binders) were predicted using the IEDB MHC-I epitope prediction tool with the IEDB recommended method 2020.09 (NetMHCpan EL 4.1) [41]. This prediction was done with the selection of a reference HLA allele set, including 16 [43]. Epitopes with a higher binding affinity (percentile rank < 1) [41] were screened in terms of antigenicity (VaxiJen v2.0) and allergenicity (AllerTOP v2.0).
Moreover, high-affinity epitopes for dog leukocyte antigen (DLA) class-I molecules (i.e., DLA-8803401, DLA-8850101, and DLA-8850801) were predicted using the abovementioned tool in IEDB, with subsequent antigenicity (VaxiJen v2.0) and allergenicity (AllerTOP v2.0) screening.
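The screening logic described above (keep strong binders, then discard peptides that are not predicted to be antigenic or that are predicted to be allergenic) can be sketched as a simple filter. This is only an illustrative sketch: the `Epitope` record, the `screen` function, the allele labels, and all values in the example are hypothetical, and applying the 0.45 VaxiJen threshold to individual epitopes is an assumption here. The web servers themselves are not called; their outputs are assumed to have been collected by hand.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Epitope:
    peptide: str            # predicted MHC binder
    allele: str             # HLA or DLA allele it was predicted for
    percentile_rank: float  # IEDB percentile rank (lower = stronger binder)
    antigenicity: float     # VaxiJen score collected for the peptide
    allergenic: bool        # AllerTOP call collected for the peptide


def screen(
    epitopes: List[Epitope],
    max_rank: float = 1.0,           # percentile rank cutoff used for CTL epitopes
    min_antigenicity: float = 0.45,  # assumed per-epitope VaxiJen threshold
    top_n: Optional[int] = None,
) -> List[Epitope]:
    """Keep antigenic, nonallergenic strong binders, best rank first."""
    kept = [
        e for e in epitopes
        if e.percentile_rank < max_rank
        and e.antigenicity >= min_antigenicity
        and not e.allergenic
    ]
    kept.sort(key=lambda e: e.percentile_rank)
    return kept if top_n is None else kept[:top_n]


if __name__ == "__main__":
    # Hypothetical records; peptides, alleles, and scores are placeholders.
    candidates = [
        Epitope("AAAAAAAAA", "allele-1", 0.30, 0.95, False),
        Epitope("CCCCCCCCC", "allele-2", 0.10, 0.20, False),  # not antigenic
        Epitope("DDDDDDDDD", "allele-3", 0.50, 1.10, True),   # allergenic
        Epitope("EEEEEEEEE", "allele-4", 2.40, 1.30, False),  # weak binder
    ]
    for e in screen(candidates):
        print(e.peptide, e.allele, e.percentile_rank)
```

Keeping the cutoffs as parameters makes it straightforward to reuse the same filter for the HTL epitopes, where a top-N selection by percentile rank was used instead of a fixed rank cutoff.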

Antigenicity, Allergenicity, Solubility, and Physicochemical Profiles of LdHASPB1. The VaxiJen antigenicity score for the LdHASPB1 protein was 1.4409, indicating a highly antigenic molecule. Based on the AlgPred server output, the protein possesses no allergenic traits, and it was predicted to be highly soluble, with a solubility score of 0.749 from the Protein-Sol online tool. The ExPASy ProtParam tool provided a number of important physicochemical characteristics: the protein is 401 amino acid residues long, with an MW of 42.19 kDa and a pI of 4.64. Moreover, there were about twofold more negatively charged residues (Asp + Glu; n = 106) in the sequence than positively charged ones (n = 53). The estimated half-life of LdHASPB1 in mammalian reticulocytes was over 30 hours, and the protein was predicted to have low thermotolerance (aliphatic index: 5.89), to be stable (instability index: 21.34), and to be extremely hydrophilic (GRAVY: -2.322) (Table 1).
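To illustrate how an acidic pI follows from an excess of Asp and Glu over basic residues, the minimal sketch below estimates the net charge of a sequence with the Henderson-Hasselbalch relation and locates the pI, the pH at which the net charge is zero, by bisection. The pKa values are approximate, generic values chosen for illustration (an assumption); ProtParam uses its own pK table, so this sketch would not reproduce its output exactly. The function names and toy sequences are hypothetical.

```python
# Approximate side-chain pKa values (assumed, for illustration only).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA, C_TERM_PKA = 8.6, 3.6


def net_charge(seq: str, ph: float) -> float:
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    # Terminal amino group (positive) and terminal carboxyl group (negative).
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))
    for aa in seq.upper():
        if aa in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return charge


def isoelectric_point(seq: str, tol: float = 1e-4) -> float:
    """pH at which the net charge crosses zero, found by bisection."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        # Net charge decreases monotonically with pH.
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Hypothetical toy peptide with twice as many acidic as basic residues.
    toy = "DEDEKGDEKG"
    pi = isoelectric_point(toy)
    print(round(pi, 2), round(net_charge(toy, 7.0), 2))
```

At any pH above the estimated pI the toy peptide carries a net negative charge, and below it a net positive charge.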

Prediction of Signal Peptide, Transmembrane Domain, Subcellular Localization, and PTM Sites. No signal peptide or transmembrane domain was detected in the LdHASPB1 protein sequence, based on the SignalP and DeepTMHMM web servers, respectively. Additionally, the DeepLoc tool indicated that the protein is probably a cell membrane component (likelihood: 0.63). N-Glycosylation and palmitoylation sites were rarely predicted in the protein sequence, whereas lysine acetylation, O-glycosylation, and phosphorylation sites were abundant, with 11, 16, and 28 sites, respectively. The detailed properties of the PTM sites are provided in Table 1.

3.3. Secondary Structure Analysis. According to the NetSurfP-3.0 server, the residues were exposed in most parts of the sequence, with frequent coil regions. A high probability of disordered regions was also observed throughout the protein sequence (Table 1 and Figure 1).

In this study, linear B-cell epitopes were predicted with three different web tools, ABCpred, BepiPred-2.0, and SVMTriP, using strict thresholds. The output of each server was compared with that of the other two, and the shared epitopes were extracted and screened. On this basis, among the 8 common linear B-cell epitopes, 4 showed good antigenicity, no allergenicity, and good water solubility, based on the VaxiJen, AllerTOP, and PepCalc servers, respectively. These epitopes were "TQKNDGDG," "KEDGHTQK," "AQEKNEDGHNVGD," and "GDGPKEGENLQ" (Table 2).
Also, three conformational B-cell epitopes were predicted for the LdHASPB1 protein using the ElliPro tool, as follows: (i) 14 residues, score: 0.802; (ii) 107 residues, score: 0.695; and (iii) 105 residues, score: 0.679. More details are illustrated in Figure 5.
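The extraction of shared linear B-cell epitopes from three predictors can be sketched as an interval-overlap problem. The sketch below assumes that each server's output has been reduced to inclusive (start, end) residue ranges and that "shared" means residues covered by all three predictors; the study does not state the exact overlap criterion, so both this definition and the minimum segment length are assumptions. The function names and example ranges are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Interval = Tuple[int, int]  # inclusive 1-based (start, end) residue range


def covered_positions(intervals: Iterable[Interval]) -> Set[int]:
    """All residue positions covered by at least one predicted epitope."""
    positions: Set[int] = set()
    for start, end in intervals:
        positions.update(range(start, end + 1))
    return positions


def shared_segments(
    predictions: List[List[Interval]], min_len: int = 6
) -> List[Interval]:
    """Contiguous stretches covered by every predictor, at least min_len long."""
    common = set.intersection(*(covered_positions(p) for p in predictions))
    segments: List[Interval] = []
    run: List[int] = []
    for pos in sorted(common):
        if run and pos != run[-1] + 1:
            # Gap found: close the current run before starting a new one.
            if len(run) >= min_len:
                segments.append((run[0], run[-1]))
            run = []
        run.append(pos)
    if len(run) >= min_len:
        segments.append((run[0], run[-1]))
    return segments


if __name__ == "__main__":
    # Hypothetical residue ranges standing in for three servers' outputs.
    server_a = [(1, 20), (40, 60)]
    server_b = [(5, 25), (45, 50)]
    server_c = [(10, 30), (44, 70)]
    print(shared_segments([server_a, server_b, server_c]))
```

The resulting segments would then be passed to the antigenicity, allergenicity, and solubility screens described in the text.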

Discussion
Vaccination appears to be the best strategy for controlling Leishmania-induced infections such as kala-azar [44]. An efficacious vaccine candidate against leishmaniasis is anticipated to elicit functional and protective immune responses by activating the leishmanicidal properties of macrophages, hence preventing the increase in parasite load and the subsequent pathological immune imbalance [45]. So far, different vaccination strategies have been used to design vaccines that prevent human and/or canine leishmaniasis [5]. Subunit vaccines are of particular interest since they carry only immunogenic components of a given pathogenic organism. Thus, they are safer than killed/attenuated vaccines and induce more specific and targeted immune stimulation [46]. In this sense, immunoinformatic web servers and online tools can help to discover novel vaccine targets in the considerable amount of available genomic and proteomic data, facilitating rational vaccine design [47]. As mentioned before, Leishmania HASPs possess a stage-regulated expression pattern, being confined to metacyclic promastigotes and amastigotes [18]. They are highly immunogenic proteins, so that sera from VL- and CL-infected individuals efficiently recognize the recombinant HASPB (rHASPB) protein [48,49]. HASPB is also a metacyclogenesis marker in the sandfly vector [17]. The ubiquity of HASPs in all tested Leishmania species is beneficial for producing a general leishmaniasis vaccine [50,51]; hence, these proteins deserve further exploration through a set of in silico methods. In the current study, in silico characterization and prediction of B- and T-cell epitopes of the LdHASPB1 protein, as a potential vaccine candidate, were performed using immunoinformatic web servers.
In the first step, the biochemical characteristics of LdHASPB1 were evaluated using a set of bioinformatics web servers. This 401-residue protein had an MW of about 42 kDa, and its negatively charged residues (Asp + Glu) outnumbered the positively charged ones about two to one. Reportedly, charged residues in a protein sequence play a significant role in protein orientation/position [52], and abundant negatively charged ones preferentially occur at the noncytoplasmic flank [53]. In this study, the pI of the LdHASPB1 protein was estimated to be 4.64. The pI is the pH at which the net charge of a protein is zero, so that at pH values above and below the pI a given protein is negatively and positively charged, respectively [54]. The instability index (21.34) indicated that the protein is stable in a test tube. Moreover, a GRAVY score of -2.322 and an aliphatic index of 5.89 showed that LdHASPB1 is a highly hydrophilic molecule with low predicted thermotolerance, respectively. The aliphatic index is the relative volume of a protein occupied by its aliphatic side chains (alanine, valine, leucine, and isoleucine), and higher values are associated with thermostability over a wide range of temperatures [55]. Although a low aliphatic index was predicted for this protein, the main focus of this study was on the extensive epitope mapping of LdHASPB1, which can be used towards multiepitope vaccine construction against VL in humans and dogs. The GRAVY score is the mean of the hydropathy values of the individual residues, so scores above and below zero indicate hydrophobicity and hydrophilicity, respectively [56]. In addition, this protein was predicted to be highly antigenic and nonallergenic with high solubility. Understanding such preexperimental chemical and biophysical properties is necessary for future wet-lab experiments. Of note, the most abundant predicted PTM sites were phosphorylation (28), O-glycosylation (16), and lysine acetylation regions (11). Reportedly, such PTM sites are decisive in recombinant protein production, so eukaryotic expression systems (yeast, insect, or mammalian) are preferred over bacterial hosts for producing proteins with multiple PTM sites [57]. Although the DeepLoc server predicted that the protein is destined for the cell membrane, neither a signal peptide nor a transmembrane domain was detected by the SignalP-6.0 and DeepTMHMM servers, respectively.
Using the NetSurfP-3.0 server, surface accessibility, secondary structure, and disordered regions were predicted in the submitted protein sequence. The output showed that almost all regions of the protein were structurally disordered and surface accessible. Disordered proteins are highly abundant and mostly dedicated to regulatory functions and molecular signaling. Such regions are likely immunological targets for antibodies; hence, they seem to be important in vaccination studies [58]. Also, exposed surfaces in a protein facilitate epitope recognition by specific antibodies [59]. Random coil was the only secondary structure predicted; it is regarded as a polymer conformation in which the units are randomly oriented while remaining bonded to adjacent units [60]. In general, the protein conformation is maintained and protected during molecular interactions by internally located structures such as coils. The I-TASSER server returned five models, among which the first model, with the highest C-score (-0.79), was selected; it had an estimated TM-score of 0.61 ± 0.14 and an estimated RMSD of 8.6 ± 4.5 Å. The 3D model was further subjected to refinement and validation. According to the ProSA-web and PROCHECK analyses, the quality of the refined model was improved compared with the crude model.
Acquired immune responses play a major role in the prevention and/or control of Leishmania-induced infection in susceptible hosts. During the metacyclic phase, the HASPB1 protein can be exposed on the plasma membrane surface, facilitating detection by specific antibodies [18]. On this basis, we predicted linear and conformational B-cell epitopes for the LdHASPB1 protein. A multistep approach with six web servers was used: three for the identification of shared linear B-cell epitopes (BepiPred-2.0, ABCpred, and SVMTriP) and three for screening in terms of antigenicity, allergenicity, and water solubility (VaxiJen, AllerTOP, and PepCalc). The final output showed four potentially antigenic, nonallergenic epitopes with good water solubility: "TQKNDGDG" (antigenicity score: 1.5354), "KEDGHTQK" (antigenicity score: 1.2394), "AQEKNEDGHNVGD" (antigenicity score: 1.1338), and "GDGPKEGENLQ" (antigenicity score: 1.2153). Moreover, three conformational B-cell epitopes, which may be involved in antigen-antibody interactions, were predicted for this protein using the ElliPro tool.
Given the intracellular nature of Leishmania parasites, type 1 helper CD4+ T-cells (Th1) and cytotoxic CD8+ T-cells (CTLs) are key regulators in controlling leishmaniasis. Moreover, the capability of IFN-γ induction is a pivotal function of Th1-type epitopes, resulting in the activation of macrophages and downstream parasite clearance mechanisms [61]. It has been shown that rHASPB can induce protective immunity against L. donovani infection via direct and/or indirect interleukin-12 (IL-12) production and subsequent CD8+-dependent IFN-γ induction [62]. Moreover, several viral vector-based (adenovirus and lentivirus) fusion protein vaccines have also demonstrated significant humoral and cellular (IFN-γ and IL-4) immune responses against L. major [63], L. donovani [64], and L. infantum [19]. On this basis, further attention should be paid to the epitope analysis of LdHASPB1. In the current study, human HTL and CTL epitopes, along with dog CTL epitopes, were predicted and screened in terms of antigenicity, allergenicity, and IFN-γ induction. Of note, the predicted HTL epitopes were mostly located at positions 23-39. The potent human IFN-γ-inducing HTL epitopes predicted by the IEDB server were associated with HLA-DQA1*05:01/DQB1*03:01, one of the prevalent HLA alleles, and included "EANHGGATGVPPKHT" (antigenicity score: 0.9030) and "TEANHGGATGVPPKHT" (antigenicity score: 1.0112). Among the human CTL epitopes, five were selected as potentially antigenic and nonallergenic, based on the IEDB HLA reference set covering over 97% of the population: "EANHGGATGV" (HLA-A*68:02), "EPQKRADNI" (HLA-B*51:01), "SAKEPQKRA" (HLA-A*30:01), "APKEDGHTQ" (HLA-B*07:02), and "EPQKRADNI" (HLA-B*08:01). Since dogs are important reservoirs of L. donovani in the Old World countries, CTL epitope analysis for LdHASPB1 was also done for DLA, using the IEDB server. Our results suggested five high-ranked antigenic and nonallergenic epitopes for DLA class-I molecules: "KDSAKEPQKR" (DLA-8803401), "EANHGGATGV" and "KTTEANHGGA" (DLA-8850101), and "KDSAKEPQKR" and "TQKNDGDGPK" (DLA-8850801). Altogether, these B- and T-cell epitopes could be used to design and engineer novel multiepitope vaccine constructs, combined with epitopes from other highly immunogenic Leishmania proteins and with a Th1-biasing adjuvant such as the RS-09 synthetic peptide (toll-like receptor 4 agonist) for enhanced immunogenicity. The major challenge in designing such vaccine candidates may be their in vivo safety and reliability, which need to be further evaluated in wet-lab experiments, including challenge studies relevant to human or canine VL.

Conclusion
Due to the importance of VL in tropical and subtropical regions and its zoonotic aspects, preventive measures such as vaccination seem to be more effective than therapeutic approaches. Next-generation vaccine design using strictly screened, highly antigenic epitopic fragments of known L. donovani antigens in the context of novel immunization platforms provides new insights into vaccination against kala-azar. In the present study, the most important biophysical properties and novel B- and T-cell epitopes were predicted in the LdHASPB1 protein of L. donovani using a set of immunoinformatic servers. Notably, several CTL epitopes were predicted for human HLA reference alleles and three DLA class-I alleles, which could be further used in vaccination studies against VL, alone or combined with other epitopes/antigens, in the context of a multiepitope vaccine. As a final word, the information provided here, particularly the immunogenic CTL, HTL, and B-cell epitopes, can be of interest to vaccinology researchers and may give insight for designing novel vaccines against VL.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest.