Multi-epitope vaccine candidates based on mycobacterial membrane protein large (MmpL) proteins against Mycobacterium ulcerans

Buruli ulcer (BU) is a neglected tropical disease. It is caused by the bacterium Mycobacterium ulcerans and is characterized by skin lesions. Several studies were performed testing the Bacillus Calmette-Guérin (BCG) vaccine in human and animal models and M. ulcerans-specific vaccines in animal models. However, there are currently no clinically accepted vaccines to prevent M. ulcerans infection. The aim of this study was to identify T-cell and B-cell epitopes from the mycobacterial membrane protein large (MmpL) proteins of M. ulcerans. These epitopes were analysed for properties including antigenicity, immunogenicity, non-allergenicity, non-toxicity, population coverage and the potential to induce cytokines. The final 8 CD8+, 12 CD4+ T-cell and 5 B-cell epitopes were antigenic, non-allergenic and non-toxic. The estimated global population coverage of the CD8+ and CD4+ epitopes was 97.71%. These epitopes were used to construct five multi-epitope vaccine constructs with different adjuvants and linker combinations. The constructs underwent further structural analyses and refinement. The constructs were then docked with Toll-like receptors. Three of the successfully docked complexes were structurally analysed. Two of the docked complexes successfully underwent molecular dynamics simulations (MDS) and post-MDS analysis. The complexes generated were found to be stable. However, experimental validation of the complexes is required.

faster spread and mutation of diseases resulting in a need for more rapid vaccine development.However, this is not to say that the complete fast-tracking of vaccine development is to be encouraged, as this may result in ineffective vaccines or the development of unexpected side effects [6].Another challenge to vaccine development is the accurate reproduction of native antigens and an adequately strong immune response [5].This, in turn, requires an in-depth understanding of the mechanisms behind a protective immune response against the target pathogen, without which there may be possible negative side effects or poor clinical results [7].The advancement of various technologies in the immunology, microbiology, genetics and structural biology fields has led to the development of new approaches to vaccine development, such as bioinformatics, systems biology and high-throughput screening methods [7,8].The use of such high-throughput screening technologies may accelerate the pre-clinical phase of vaccine development [9].
Various new technologies used in vaccine research include genomics, transcriptomics, proteomics, metabolomics and immunomics [10].These 'vaccinomics' can be used to identify promising candidates from the target organism genome using in silico bioinformatics [10].One such approach is reverse vaccinology (RV), which allows for the selection of suitable antigens prior to experimental testing [11,12].This approach overcomes the limitations of traditional vaccines in terms of reducing antigenic variability and molecular mimicry [9].It also allows for the development of vaccine candidates that may be effective against a range of strains of a pathogen [13].For the development of antimicrobial vaccines, it reduces the need for bacterial culturing at the screening stage of analysis [13].This approach may also reduce the cost of the initial stages of vaccine development [9].However, this approach remains highly dependent on experimental studies, especially when examining the various aspects of the identified antigens, such as their functions, properties, structures, effect on virulence, and if they interact with the host [12].This approach is attractive as an initial design step, especially for diseases for which there are no vaccines or neglected diseases.
The causative agent of the neglected tropical disease Buruli ulcer (BU) is Mycobacterium ulcerans [14].Symptoms appear on the skin and subcutaneous tissue, with the formation of nodules, plaques, oedemas and skin ulcerations [15].However, treatment may be delayed due to the often painless progression of the lesions [15,16].This delay may result in disability surrounding joint movements or severe deformity [17].Previous treatment consisted of the removal of the lesions through surgery; however, it has now developed primarily to the use of antibiotics, with surgery being used to remove necrotic tissue [18].The current antibiotic regimen recommended by the World Health Organization (WHO) [14] is the use of 10 mg kg −1 of rifampicin once daily combined with 7.5 mg kg −1 clarithromycin twice daily.Early diagnosis and treatment are critical in minimizing the symptoms of this disease [18].However, BU often occurs in resource-limited communities that often have limited access to healthcare [19].Furthermore, the lack of complete knowledge regarding the route of transmission or vectors continues to hinder prevention and control strategies [20].
Presently, there are no clinically accepted BU vaccines available.The Bacillus Calmette-Guérin (BCG) vaccine used against tuberculosis is the only vaccine tested in humans against BU [21,22].However, it has been concluded that current BCG vaccines cannot fully protect against BU in humans [23].Additional studies using M. ulcerans-specific vaccines have been carried out in various animal models; however, long-term protection was not observed [24][25][26][27][28][29][30][31][32][33][34][35][36].The RV approach has been applied in two studies at present [37,38].These studies successfully identified T-cell and B-cell epitopes from the proteomes of M. ulcerans.The first study focused on chromosome-encoded virulence proteins [37], while the second study analysed proline-glutamate polymorphic GC-rich sequence (PE-PGRS) family proteins [38].Our previous study focused on identifying epitopes from the major facilitator superfamily (MFS) transporter protein of M. ulcerans [39].The identified epitopes [37] and vaccine candidates constructed by [38,39] displayed several desirable properties.However, several proteins within the M. ulcerans proteome remain to be studied.
One such set of proteins is the mycobacterial membrane protein large (MmpL) proteins.These transporters are a subclass of the resistance-nodulation-cell division (RND) superfamily [40].The MmpL proteins play a critical role in the establishment of the mycobacterial cell envelope through the translocation of complex envelope lipids, which are associated with virulence, and siderophores through the plasma membrane to the periplasmic space [41].The importance of their role highlights their potential as vaccine candidates.MmpL transporters can be identified as potential important virulence factors [40].Identifying proteins and resulting antigens with an outer membrane topology is advantageous, as these are more accessible to the host's immune system [37].The aim of this study is to identify CD8 + and CD4 + T-cell and B-cell epitopes with desirable properties from the MmpL proteins of M. ulcerans.Once screened, these epitopes are used to construct vaccine candidate sequences and docked with Toll-like receptors (TLRs).These constructs and complexes undergo analysis to determine their capability to produce a protective immune response against M. ulcerans infection.

Identification of human leucocyte antigen alleles for T-cell epitope prediction
In order to determine the human leucocyte antigen (HLA) alleles to be selected for analysis, the Allele Frequency Net Database (http://www.allelefrequencies.net/default.asp)[46] was used to identify HLA alleles within endemic countries.These alleles were filtered based on the allele frequency, in order of highest to lowest, and the population standard was set to gold only (electronic supplementary material, table S2).The top 10 HLA allele types for each endemic country were selected and used to generate the respective CD8 + and CD4 + epitope peptides, based on the availability of the HLA alleles on the respective websites (electronic supplementary material, table S3) [46][47][48].

Prediction of CD8 + T-cell epitopes
The identified potential outer membrane proteins were extracted and submitted to the NetMHCpan v 4.1 website (https://services.healthtech.dtu.dk/service.php?NetMHCpan-4.1),with the peptide length set to 9 [47].Using the chosen HLA alleles (electronic supplementary material, table S3), peptides capable of binding to MHC I were generated.Only peptides with IC 50 values ≤ 250 nM were extracted and submitted to VaxiJen v 2.0 (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html) for antigenicity analysis [49].The target organism was set to bacteria, with a threshold of 0.5.Peptides with an antigenicity value ≥ 0.5 were extracted.The antigenic nanomers were then analysed using the Class I Immunogenicity Web server (http://tools.iedb.org/immunogenicity/)[50], following which peptides with positive scores were selected.

Prediction of CD4 + T-cell epitopes
The identified potential outer membrane proteins were also submitted to NetMHCIIpan v 4.0 (https://services.healthtech.dtu.dk/service.php?NetMHCIIpan-4.0)to predict which peptides would be capable of binding to MHC II molecules of known sequences [48].The identified HLA alleles were selected (electronic supplementary material, table S3) and the peptide length was set to 15. Upon completion of the analysis, only peptides with IC 50 values ≤ 250 nM were extracted.These peptides underwent antigenicity analysis using VaxiJen v 2.0 (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html) [49].The target organism selected was bacteria, and the threshold was set to 0.5.Only antigenic epitopes (≥ 0.5) were selected.

Population coverage analysis
The conserved T-cell epitopes were submitted to the Population Coverage Web server (http://tools.iedb.org/population/) [57].The 8 CD8 + and 12 CD4 + T-cell epitopes and their respective HLA alleles were inputted (electronic supplementary material, table S4).The endemic regions were chosen per the WHO database [58].The following regions were selected: World, Australia, Cameroon, Central African Republic, Congo, Côte d'Ivoire, Ghana, Gabon, Japan, Liberia, Nigeria, Papua New Guinea, Sudan and West Africa.The population coverages for class I, class II and class combined were estimated.

Physico-chemical analysis of the vaccine candidate sequences
The candidate sequences were submitted to the ProtParam tool (https://web.expasy.org/protparam/)[88] for analysis of their in silico physical and chemical properties.These properties consist of molecular weight, amino acid composition, isoelectric point, instability index, aliphatic index, the grand average of hydropathicity (GRAVY) and the estimated half-life of the protein in mammalian cells, yeast cells, and Escherichia coli [88].The models were then analysed using the SCooP v 1.0 website (http://babylone.ulb.ac.be/SCooP/index.php)[89][90][91].Various thermodynamic properties of the structures, such as the melting temperature (T m ), change in enthalpy (ΔH m ), change in specific heat upon folding (ΔC p ), and the folding free energy at room temperature (ΔG r ), are determined by this website [89][90][91].

In silico codon adaptation, vaccine optimization and expression
The JAVA Codon Adaption Tool (JCat) (http://www.jcat.de/)[92] was used to perform codon adaptation and vaccine optimization.Escherichia coli (strain K12) was selected as the model organism.This organism was selected due to the dissimilarity between its codon usage and humans, which may result in higher expression [93].The restriction sites HindIII (5'AAGCTT'3) and BamHI (5'GGATCC'3) were added to the N-and C-terminals of all the optimized DNA sequences, respectively.The insertion of the candidate sequences into the pET-28a(+) vector was carried out using SnapGene v.6.0.2 software (from Insightful Science; available at https://www.snapgene.com/).

Immune simulations
C-IMMSIM (https://kraken.iac.rm.cnr.it/C-IMMSIM/index.php?page=1) was used to simulate potential immune responses to MEV1, MEV2 and MEV5 [102,103].The parameters were left as the default settings, with a time step of 1 and a single injection with no lipopolysaccharide (LPS).

Molecular dynamics simulations
Molecular dynamics simulations (MDS) using the AMBER 14 and 18 packages were performed on the docked complexes, bound, and unbound MEV constructs [104,105].MDS are used to determine the stability of the complexes and the interactions between the proteins [106].The proteins were described using FF14SB [107].The topologies were generated using the LEaP module of AMBER 14 [104].Protons and Na + ions were added as counter ions to the complexes, and Cl − was added to the unbound MEV constructs.This was done to neutralize the system in an orthorhombic box of TIP3P water molecules of 8 Å [108].Initial energy minimization was carried out for 10 000 steps (500 steepest descents with 9500 conjugate gradient).However, initial energy minimization failed to run for construct 2. Full energy minimization was carried out for 2000 steps for constructs 1 and 5.The complexes were gradually heated from 0 K to 300 K in a canonical ensemble (NVT) with a Langevin thermostat for 2 ns.The collision frequency applied to the system was 1.0 ps −1 , with the density of the water system regulated with 2 ns of NPT (constant number N, pressure P and time T) simulation.The complexes were equilibrated at 300 K for an additional 2 ns at a pressure of 1 bar.The production stage was run for 100 ns at NVT.The simulations were run using the GPU (CUDA) version of PMEMD provided in AMBER 18 [105,[109][110][111].

Post-molecular dynamics simulations analysis
Post-MDS analysis was performed using the CPPTRAJ and PTRAJ modules in AMBER 18 [105,112].The root mean square deviation (RMSD) and the root mean square fluctuations (RMSF) of the complexes and the MEV constructs were determined.The Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) module in AMBER 18 was used to calculate the endpoint binding free energy of the docked complexes using the formula The CPPTRAJ and PTRAJ modules generated 2000 frames of the complexes and the unbound MEVs [112].VMD v 1.9.3 was used to view the generated structure [69].The Bio3D package was loaded onto RStudio v 4.0.4 and used to perform principal component analysis (PCA) and crosscorrelation analysis [113,114].PCA provides insight into the nature of the clusters and conformational variations that occurred after MDS [115].Through the analysis of pairwise cross-correlation coefficients, cross-correlation analysis generates a dynamical cross-correlation matrix (DCCM) [113].This can be used to determine the magnitude to which the fluctuations within the complexes and MEVs are correlated [113].

Identification of potential virulent outer membrane peptides
A total of 49 potentially virulent MmpL proteins were identified.Following alignment, there were 465 peptides with varying lengths.There were 237 peptides that were determined to have an outer topology.S6).

Population coverage of the T-cell epitopes
The highest number of epitope hits or HLA combinations recognized by the highest percentage of individuals was 5 royalsocietypublishing.org/journal/rsob Open Biol.13: 230330 (7.50%), and the lowest, excluding 0, were 40 to 42 (0.01%) (figure 1a).From 43 through to 60 epitope hits/HLA combinations, the number of individuals was 0 (figure 1a).The global class I, class II and class combined population coverages were 49.96%, 95.43% and 97.71%, respectively (figure 1b; electronic supplementary material, table S7).All the global estimates were above the estimated average (electronic supplementary material, table S7).Papua New Guinea had the highest class combined population coverage (99.90%), while Liberia had the lowest (28.23%) (figure 1b; electronic supplementary material, table S7).

Identification and analysis of B-cell epitopes
A total of 334 B-cell epitopes were identified by ABCpred.These epitopes were then screened for antigenicity and reduced to 138 B-cell epitopes.There were 93 non-allergenic and non-toxic epitopes.Upon completion of conservancy analysis, 5 overlapping B-cell epitopes were identified (electronic supplementary material, table S8).The antigenicity scores of the epitopes ranged from 2.06 (WFWWPMRVR TRPTRTP) to 1.28 (MTPAIAALLGRWFWWP) (electronic supplementary material, table S8).The highest ABCpred score was 0.92 (MMPAIAALLGRWFWWP and MTPAIAALLGR WFWWP), and the lowest was 0.82 (LGRWFWWSPPPF LRAS).The B-cell epitopes were traced back and found to originate from 11 potential virulent proteins (electronic supplementary material, table S6).The virulence scores of the source proteins of the T-and B-cell epitopes ranged from 0.50 (WP_157759966.1) to 1.10 (WP_240273203.1)(electronic supplementary material, table S6).

Multi-epitope vaccine sequence analyses 3.2.1. Construction and structural analyses of the MEV candidate sequences
Five vaccine sequences were constructed, as shown in figure 3.All the candidate sequences were found to be non-allergenic.The antigenicity scores of the candidates ranged from 1.42 (MEV3) to 1.17 (MEV1 and MEV4) (table 2).All the constructs consisted mainly of random coils, followed by extended strands and then alpha helices (table 2).MEV5 had the highest percentage of alpha helices (22.25%) and the lowest percentage of random coils (42.16%) (electronic supplementary material, figure S1).royalsocietypublishing.org/journal/rsob Open Biol.13: 230330 The sequences were refined (figure 4).The PROSA scores for the candidates were negative (table 2; electronic supplementary material, figure S2).The lowest score was −3.98 (MEV4), and the highest was −1.44 (MEV3) (table 2; electronic supplementary material, figure S2).The overall quality factors ranged from 64.60 (MEV4) to 74.16 (MEV5) (table 2; electronic supplementary material, figure S3).Ramachandran plots were generated for the refined candidate structures (electronic supplementary material, figure S4).MEV5 was the only construct with no residues in the disallowed regions and had the lowest percentage of residues in the generously allowed regions (table 2).MEV3 had the highest percentage of residues in the most favoured regions, while MEV2 had the lowest percentage of residues in this region and the highest percentage in the disallowed regions (table 2).

Physico-chemical analysis of the vaccine candidate sequences
The vaccine sequences consisted of varying numbers of amino acids (  3).The N-terminal for all the sequences was considered to be methionine (M).The half-life estimated in mammalian reticulocytes (in vitro), yeast (in vivo) and E. coli (in vivo) were 30 h, greater than 20 h and greater than 10 h, respectively.Gibbs-Helmholtz curves were generated based on the five vaccine candidate structures (electronic supplementary material, figure S5).MEV4 had the highest melting temperature (65.00°C), while MEV3 had the lowest melting temperature (52.70°C) (electronic supplementary material, table S9).Only MEV2 had negative values for ΔH m , ΔC p and ΔG r , as indicated by the difference in the curve (electronic supplementary material, table S9, figure S5B).

Binding interaction of constructed multi-epitope to the TLRs and structural analysis of the TLR-MEV complex
Only three vaccine sequences (MEV1, MEV2 and MEV5) were successfully docked to their respective TLR, resulting in complex 1, complex 2 and complex 3, respectively (figure 6).The best binding energies of the abovementioned complexes were −266.68,−382.36 and −368.67,respectively.Complex 1 consisted of 3 salt bridges, 6 hydrogen bonds and 119 nonbonded contacts (electronic supplementary material, figure S6A).Complex 2 contained 3 salt bridges, 3 hydrogen bonds and 142 non-bonded contacts (electronic supplementary material, figure S6B).Complex 3 had the highest numbers of bonds with 5 salt bridges, 9 hydrogen bonds and 131 nonbonded contacts (electronic supplementary material, figure S6C).The contact maps generated based on the complexes indicate varying degrees of contact between the residues (electronic supplementary material, figure S7).Complex 1 had a minimal score of −3.93, a maximal score of 3.54, an average score of −0.54 and a total score of −658.74 (figure 7a).The minimal, maximal, average, and total scores of complex 2 were −3.59, 3.25, −0.47, and −585.18,respectively (figure 7b).Complex 3 had a minimal score of −3.70, a maximal score of 5.35, an average score of −0.31, and a total score of −331.19 (figure 7c).Complex 1 had the highest number of soluble residues, while complex 3 had the lowest (table 4).Complex 3 had the highest aggregation-prone residues, while complex 1 had the lowest (table 4).

Immune simulations
Immune simulations were carried out for MEV1, MEV2, and MEV5 (figures 8 and 9; electronic supplementary material, figure S8).All the vaccine sequences induced immunoglobin and antibody responses (figures 8a and 9a; electronic supplementary material, figure S8A).However, only MEV2 induced IgG1 + IgG2 and IgG2 and had the highest levels of IgM + IgG and IgM (electronic supplementary material, figure S8A).For all vaccine candidates, the antigen response increased and consequentially decreased within the 0 to 5-  S8A).All vaccine sequences induced the production of memory, non-memory, and B isotype IgM B-cells (figures 8b and 9b; electronic supplementary material, figure S8B).MEV2 generated the lowest number of B isotype IgM and memory B-cells (figure 9b).The production of B isotype IgG1 and IgG2 was negligible (figures 8b and 9b; electronic supplementary material, figure S8B).The population of memory CD8 + T-cells remained constant throughout the simulation for all the sequences (figures 8c and 9c; electronic supplementary material, figure S8C).The induction of nonmemory CD8 + T-cells fluctuated throughout the simulation, and the values were similar (figures 8c and 9c; electronic supplementary material, figure S8C).All the sequences induced the production of memory and non-memory CD4 + T-cells; however, MEV5 induced the lowest number of both types of cells (figures 8d and 9d; electronic supplementary material, figure S8D).The vaccine sequences induced the production of IFN-γ, transforming growth factor beta (TGF-β), IL-2, IL-10, and IL-18 (figures 8e and 9e; electronic supplementary material, figure S8E).The levels of IFN-γ and IL-18 induced were similar for all the sequences; however, MEV5 induced the lowest amount of IL-10 and TGF-β (figures 8e and 9e; electronic supplementary material, figure S8E).

Molecular dynamics simulations
The RMSD values of complex one and the bound MEV were lower than that of the unbound MEV (figure 10a).However, there was an overlap of the RMSD values of the bound and unbound for the third complex (figure 10b).The fluctuations of the backbone of the atoms of the bound MEV1 ranged from 0.55 Å to 16.09 Å, while the fluctuations of the backbone of the atoms of the unbound MEV1 ranged from 0.55 Å to 27.75 Å (figure 10a).The highest value for the fluctuations of the backbone of the atoms for the bound MEV5 was 0.49 Å, and the lowest was 22.74 Å (figure 10b).The fluctuations of the backbone of the atoms of the unbound MEV5 ranged from 0.52 Å to 24 Å (23.9954) (figure 10b).These values indicate better stability of the bound MEVs.Residues within the third complex displayed lower fluctuations than the first complex (figure 10c,d).There was also an overlap between complex one and the unbound MEV1 (figure 10c).

Post-molecular dynamics simulations analysis
MM/GBSA was used to estimate the binding affinities of the vaccine sequences to their respective TLRs within the complex (table 5).royalsocietypublishing.org/journal/rsob Open Biol.13: 230330 good interactions between the TLRs and vaccine sequences.The first complex displayed better binding interactions than the third complex (table 5).Both complexes had negative van der Waals energies, which is favourable (table 5).The electrostatic, gas-phase and solvation free energy contributions are shown to be significant.Principal crosscorrelation captured 94.20% of the variance of the atom positional fluctuations of the first vaccine sequence during MDS (figure 11a).PC1 to PC3 contributed 76% of the total proportion of the variance shown in the eigenvalue plot (figure 11a).The greatest contribution was by PC1 (52.61%),   followed by PC2 (15.99%) and then PC3 (7.40%) (figure 11a).This differed from the principal cross-correlation captured for complex one, which was 92.10 (electronic supplementary material, figure S9A).The total contributions of PC1 to PC3 were estimated to be 72.68%(electronic supplementary material, figure S9A).PC1 contributed 51.77%, PC2 contributed 16.09%, and PC3 contributed 4.82% (electronic supplementary material, figure S9A).The fifth vaccine sequence had a lower proportion of variance than the first, i.e. 93.80% (figure 11b).The three PCs were accountable for 71.91% (figure 11b).PC1 contributed the highest variability, followed by PC2 and PC3, with proportions of 48.93%, 16.28% and 6.70%, respectively (figure 11b).The variance of the third complex was lower, with the 20 PCs capturing 89% of the variance (electronic supplementary material, figure S9B).PC1 to PC3 contributed 64.47% of the variance (electronic supplementary material, figure S9B).Similar to the other plots, PC1 contributed the highest (48.08%), then PC2 (12.98%), and PC3 (6.41%).Residues within the complexes and vaccine constructs were observed to move in similar directions near the diagonal, indicating the intercorrelation of residues (figure 12; electronic supplementary material, figure S10).Various residual-wise loading fluctuations were observed for the complexes and vaccine constructs (electronic supplementary material, figure S11).

Discussion
BU is a disease that burdens individuals and communities physically, economically, and socially.The progression of technology has given rise to new methods that can supplement existing vaccine methodologies while lowering production costs and time.One such method, i.e. immunoinformatics, was used in this study to generate multiple in silico vaccine constructs, which may be capable of inducing protective immune responses against M. ulcerans infection.The recognition of a vaccine by the immune system of the host is essential for its effectiveness [116].The host immune system is comprised innate and adaptive immunity.T-cells and B-cells play crucial parts in removing infection in the host through the adaptive immune response [117].The role of CD8 + or cytotoxic T-cells is to recognize and destroy infected cells, while CD4 + or helper T-cells are involved in the differentiation of B-cells and cytokine induction [1].This constitutes the cell-mediated response; the humoral antibody response consists of B-cells which are responsible for the generation of antigen-specific antibodies [1,117].These antibodies are responsible for the neutralization of infection prior to its spread [118].
Upon infection with M. ulcerans, immune suppression may occur due to the toxin mycolactone [119].The immune response of the host has been found to correspond to the stage of the disease, with a predominantly TH2 response during the initial progression of the disease and a TH1 response during the later or healing stages of the disease [120].This TH2 response has been coupled with increased IL-10, while the TH1 response corresponds with the secretion of high levels of IFN-γ [120].IFN-γ is critically involved in protection against mycobacteria, with a lack of IFN-γ in patients possibly contributing to the persistence of M. ulcerans, the painless characterized by the disease and the failure of the disease to respond to treatment [119].Higher levels of IFN-γ were observed in patients in ulcerative rather than pre-ulcerative phases and in late BU or healed patients when compared to controls [120].This differed from the findings of decreased IFN-γ production in patients with ulcerative lesions compared to nodules [121].This was the opposite for the IL-10 cytokine, with increased levels found in patients with ulcerative lesions than with nodules [121].The generation of conserved CD8 + T-cell epitopes with overlapping CD4 + epitopes capable of inducing IFN-γ is attractive.
Antigenicity is another desirable property, as it is the ability of an antigen to interact specifically with the binding site of an antibody molecule [122].The epitopes identified in this study have varying antigenicity scores.The non-allergenic and non-toxic nature of the epitopes, and corresponding vaccine constructs, indicate a lower probability of the induction of adverse effects.The epitopes in this study share several properties with previously identified epitopes [37][38][39].This includes antigenicity, non-allergenicity, non-toxicity, cytokine-inducing capabilities, and some degree of population coverage.However, none of the final epitopes generated in this study were shared with prior studies [37,38].This may be due to the inclusion of other strains of M. ulcerans and the use of a different protein, i.e.MmpL.While our previous study used the same strains of M. ulcerans [39], there were a larger number of epitopes which displayed the abovementioned desirable properties identified in this study.The resulting vaccine candidates displayed similar structural properties; however, overlap of the RMSD values was only observed in this study.This may be linked to the greater size of these vaccine candidates or the quality of the structure.
Different species of mycobacteria are observed to have varying numbers of MmpL proteins, with the H37Rv stain  royalsocietypublishing.org/journal/rsob Open Biol.13: 230330 of M. tuberculosis having 13, M. leprae with 5, M. smegmatis with 16, and M. abscessus with approximately 31 putative transporters [41,123].However, 6 MmpL transporters (MmpL 3/4/7/10/11 and 13a/13b) are thought to form the core set of this protein, as there is a high degree of syntenic conservation in both slow-growing and rapidly growing mycobacteria [40,41].This was observed by examining the MmpL proteins of M. tuberculosis, M. leprae, M. avium, M. marinum, M. smegmatis and M. abscessus [40,41].Little is known regarding their structural organization, topological architecture and mechanism of action in mycobacteria [123].MmpL proteins are found to indirectly contribute towards virulence through their effect on bacterial survival [41].These proteins are also associated with drug resistance through their impact on the efflux of antibiotics from the periplasm [124].The use of epitopes derived from these proteins in the vaccine candidates is promising.Another promising characteristic is the high population coverage of these epitopes in most endemic regions.WHO has reported incidences of BU in 33 countries, including Australia, Benin, Cameroon, Central African Republic, Congo, Côte d'Ivoire, Democratic Republic of the Congo, Gabon, Ghana, Guinea, Japan, Liberia, Nigeria, Papua New Guinea, Togo, French Guiana and South Sudan [14,58].The regions identified as endemic were selected for population coverage analysis; however, no population coverage was generated for Ghana.This may be due to the lack of the use of HLA alleles within this study, as the Accre Asutuare Akan population did not show any alleles under the gold or gold and silver population standards [46].The Ga-Adangbe population of Ghana was missed when identifying the most frequent HLA alleles in the population; however, this can be addressed in later studies.Togo and Benin were not featured on the population coverage webserver [57].The inclusion of West Africa was done to encompass Benin and Ghana.The class combined population of West Africa (98.45%) was high.When compared to the study by Nain et al. [38], the global class I population coverage (49.96%) and the class combined population coverage (97.71%) were   The epitopes that displayed the abovementioned desirable properties were combined, together with linkers and adjuvants, to form several vaccine candidate sequences.The addition of the adjuvants was done to increase the stimulation of the immune response, thereby increasing the immunogenicity of the vaccine construct [117,125].TLR agonists are shown to be promising adjuvants due to their ability to induce antigen-presenting cells (APCs) [126].The lipoprotein LprG was one of the adjuvants used, as it is a TLR2 agonist.Brightbill et al. [127] found that the use of microbial lipoproteins to activate TLRs may lead to the induction of innate defence mechanisms against royalsocietypublishing.org/journal/rsob Open Biol.13: 230330 pathogens.LprG has been observed to aid the immune system in recognizing M. tuberculosis [72].RpfE was also used as an adjuvant due to its role as a TLR4 agonist.RpfE from M. tuberculosis is involved in the modulation of dendritic cell (DC) function, which facilitates the differentiation of CD4 + T-cells to Th1 or Th17 [126].hBD2 was chosen as an adjuvant due to its antibacterial activity [128].Defensins defend against pathogenic infection by linking both the adaptive and innate immune systems through T-cells and DCs [128].Linkers serve to separate epitopes and facilitate folding to achieve epitope display [129].The EAAAK linker is defined as a rigid linker and plays a role in protein stability by maintaining a distance between domains [130].It is advised that it should be used after an adjuvant [130].The AAY linkers are involved in the formation of natural epitopes while also limiting junctional epitope formation [131].This was also observed with the   royalsocietypublishing.org/journal/rsob Open Biol.13: 230330 GPGPG linker, which also aided in increasing the specificity of the immune response [132].The KK linker was used to preserve the independent immunological activity of the epitopes [133].Linkers have an impact on the hydrophobicity, secondary structures, sensitivity to protease, and the interactions within the regions of the vaccine constructs [134].Some of these characteristics were further analysed.The vaccine constructs displayed several physico-chemical properties.Proteins with molecular weights ≤110 kDa are desirable in vaccine development due to rapid purification [135,136].All vaccine candidate sequences were below 110 kDa.It is found that the higher the aliphatic index, the more thermostable the protein is [137].The varying levels of thermostability of the constructs are positive, as it was found that thermostable vaccines would not require cold temperature storage, thereby potentially decreasing medical costs and productivity losses while increasing the reach of the vaccine [138].This, coupled with the high melting temperatures, is advantageous.The negative GRAVY values, indicative of hydrophilicity, indicate strong interaction with water molecules [139].This allows for easy dispersion and mobility of the proteins [140].All the vaccine candidate sequences were found to be unstable as per the II; however, this value is based on the dipeptide composition of the sequence and does not factor in different environmental conditions nor the secondary and tertiary structures of the protein [88,141].It was found that in vitro conditions also impact the stability of a protein [141].Stability of the vaccine candidates may be modulated by ensuring optimal environmental conditions [142].The estimated half-life of the candidates in different organisms is positive, as it may ensure that the constructs will remain viable long enough to induce an effective immune response.However, this would need to be experimentally validated.
The CAI is used to measure codon usage bias, with it correlating positively to gene expression [143].The high CAI values for each vaccine candidate indicate that they may adapt well to E. coli (strain K12) [143].The GC content was within the range of 30-70%, indicating a high potential for reproducibility and good protein expression [144].The simulated immune response of the vaccine candidates displayed the production of T-cells, B-cells, cytokines and immunoglobins.Several studies involving the immunization of M. ulcerans-specific candidates in animal models have observed the induction of specific immunoglobins, such as IgG, IgG1, IgG2a and IgG2b [25,[27][28][29]32,34,36]. The release of immunoglobins has been observed in patients, healthy family members and the sera of patients with TB from nonendemic areas [145].High levels of IgG responses were noted in all three groups, while IgM antibody responses were more distinct in patients [145].IgG production in FVB/N mice has been observed during the healing stage of M. ulcerans infection [146].It was hypothesized that this immunoglobin might contribute towards the control of M. ulcerans infection in this model [146].The production of IgM, IgG1 and IgM and IgG are promising.However, only one dose of the vaccine candidates was run, with vaccines generally requiring more than a single dose before full protection is achieved.
The secondary structures of proteins impact protein folding and provide information regarding the functions and interactions of the protein [147].The large proportion of random coils in the vaccine candidates indicates the high flexibility of the proteins [148].The presence of alpha-helices is advantageous, as it indicates the maximum possible number of hydrogen bonds, indicative of stability [149].Extended strands are common structural occurrences in proteins, and isolated extended strands often share characteristics of loops and β-sheets [150].These secondary structures each affect the overall protein structure.The protein structures showed marked improvement following refinement, indicated by the Z-score and the Ramachandran plots.The negative Z-score indicates that the model quality of the proteins is good [84].This was further emphasized with most residues lying within the most favoured regions of the Ramachandran plots.However, overall quality factors of 95% or higher are considered to show good high-resolution structures [151].The lower quality factors of the vaccine candidates will need to be addressed in further studies.
TLRs are crucial in recognizing pathogenic signals and activating innate and adaptive immune responses [152].Through the interaction of the TLR and vaccine antigens, Th1 cytokines may be produced [116,127].TLR2 and TLR4 royalsocietypublishing.org/journal/rsob Open Biol.13: 230330 are expressed on the cell surface and primarily recognize lipids, lipoproteins and proteins that form the microbial membrane [116,153].TLR2 is observed to play a role in the recognition and induction of the host immune response to mycobacteria [154,155].This is done via the activation of the MyD88, IRAK4, TRAF6, TAK1, and NF-kB dependent and NF-κB independent signalling pathways [152,153].TLR4 has been observed to induce host responses against M. tuberculosis; this includes the induction of apoptosis and antiinflammatory responses in macrophages [152,156].
The bonds within the TLR-vaccine complexes consisted mainly of hydrogen bonds and salt bridges.Both of these types of interactions are involved in protein binding [157].The presence of hydrogen bonds within proteins contributes to protein stability [158].However, the contribution of each hydrogen bond is dependent on its distance and geometry [158].Salt bridges are also found to contribute towards protein stabilization [159].They are found to play a role in the folding of globular proteins, and may contribute towards membrane protein stability, conformational specificity, and the positioning of critical functional groups [160,161].
Another key characteristic within the complexes is the smaller proportion of aggregation-prone residues, as aggregation may lead to the reduction of production yield and final product activities and may trigger unpredictable immune responses in patients [162,163].
MDS was used to further analyse the interactions between the proteins and ligands and the conformational modification of the proteins at the atomic level [164].The RMSD and RMSF values indicate the stability of the two complexes at the end of the simulation.However, as MDS often emulates the absolute atomistic level molecular behaviour in near-cytosolic conditions through the use of the relevant force-fields and water model, further physiological validation through in vitro and ex vivo studies is required [165,166].MMGBSA estimates the free energy of the binding of small ligands to biological macromolecules [167].In this study, it was used to estimate the free energy of the binding of the vaccine constructs and TLRs.The energy estimates showed significant contributions from the van der Waals forces, electrostatic energy, and gas-phase energy.Van der Waals interactions often correlate to the number of atoms, with lower van der Waals energies correlating to larger peptides [168].These interactions are an important factor for binding affinity [168].Hydrogen bonds and salt bridges are classified mainly as electrostatic interactions [158,160].This may have contributed towards the significant electrostatic energy contributions.
BU remains a burden on not only affected individuals but communities as a whole.Early diagnosis and treatment prevent the development of severe symptoms.The prevention or control of the infection may also be beneficial.However, control and prevention methods remain lacking.The epitopes, and by extension, the vaccine constructs and complexes generated in this study displayed several appropriate characteristics.These characteristics suggest the ability of these constructs to generate a potentially protective immune response against M. ulcerans infection.This study serves to further the scope of existing screened epitopes.However, further experimental validation is required as this work was only done computationally.
Ethics.This work did not require ethical approval from a human subject or animal welfare committee.

Figure 1 .
Figure 1.The population coverage of the CD8 + and CD4 + epitopes for the world and identified endemic regions.(a) The combined class HLA coverage for the world population.(b) The population coverage of individual and combined epitopes for the significant regions.

Figure 2 .
Figure 2. Superimposition of the most suitable epitopes and the binding interactions of the most promising epitopes.(a) The CD8 + epitopes are superimposed, with each colour representing a different model.The respective HLA allele structures are shown in grey.(b) The LigPlot depicting the interaction between the residues of the receptor and ligand (RWFWWPLRM).Hydrogen bonds are depicted in green and salt bridges in red.The residues involved in the hydrogen bonds are shown in green for the ligand and blue for the epitope.(c) The CD4 + epitopes are superimposed, with each colour representing a different model.The HLA allele structures are shown in grey.(d ) The depiction of the RWFWWPMRVRTRPTR epitope with the HLA allele.The hydrogen bonds between the donor and acceptor are indicated by pink and green, respectively.White indicates neutral atoms.

Figure 3 .
Figure 3. Schematic representation of the five multi-epitope vaccine constructs and a legend indicating the epitopes and linkers.(a) The schematic model of vaccine construct one, utilizing all the linkers.(b) The schematic model of vaccine construct two, utilizing all the linkers.(c) The schematic model of vaccine construct three, utilizing all the linkers.(d ) The schematic model of vaccine construct four, utilizing the EAAAK, AAY and KK linkers.(e) The schematic model of vaccine construct five, utilizing the EAAAK, AAY and KK linkers.

Figure 5 .
Figure 5.The in silico cloning map of the pET-28a(+) plasmid, with the optimized DNA sequence shown in red.(a) The cloning map of the optimized DNA sequence of the first MEV construct.The sequence is located between HindIII (173) and BamHI (762).(b) The cloning map of the optimized DNA sequence of the second MEV construct.The sequence is located between HindIII (173) and BamHI (2105).(c) The cloning map of the optimized DNA sequence of the third MEV construct.The sequence is located between HindIII (173) and BamHI (1667).(d ) The cloning map of the optimized DNA sequence of the fourth MEV construct.The sequence is located between HindIII (173) and BamHI (2033).(e) The cloning map of the optimized DNA sequence of the fifth MEV construct.The sequence is located between HindIII (173) and BamHI (1595).

Figure 6 .
Figure 6.The crystalline structures of the MEV-TLR complexes.The multiepitope chain is shown in red and the respective TLR is shown in blue.(a) The first MEV-TLR2 complex.(b) The second MEV-TLR4 complex.(c) The third MEV-TLR4 complex.

Figure 7 .Figure 8 .
Figure 7.The solubility and aggregation propensity of the MEV-TLR complexes.The graphical representation model shows the soluble residues in red, the aggregation-prone residues in blue, and residues with no predicted influence shown in white.(a) The model based on the first MEV-TLR2 complex.(b) The model based on the second MEV-TLR4 complex.(c) The model based on the third MEV-TLR4 complex.

Figure 9 .
Figure 9.The immune simulation results from C-IMMSIM of MEV construct five.(a) The induced antigen and immunoglobin responses.(b) The B-cell population: total count, memory cells, and sub-divided in isotypes IgM, IgG1 and IgG2.(c) The CD8 + cytotoxic lymphocyte count.(d ) The CD4 + T-helper lymphocyte count.(e) The induced cytokine response.The inset plot displays the danger signal coupled with IL-2.

Figure 10 .
Figure 10.The molecular dynamics simulations of the MEV-TLR docked complexes, MEV bound to TLR and the unbound MEV.(a) The RMSD plot was generated based on complex one.The bound MEV is shown in red, the unbound MEV in grey and the TLR2-MEV complex in blue.(b) The RMSD plot was generated based on complex three.The bound MEV is shown in grey, the unbound MEV in blue and the TLR4-MEV complex in red.(c) The RMSF plot was generated based on complex one.The bound MEV is shown in grey, the unbound MEV in blue and the TLR2-MEV complex in red.(d ) The RMSF plot was generated based on complex three.The bound MEV is shown in grey, the unbound MEV in blue and the TLR4-MEV complex in red.

Figure 11 .
Figure 11.The PCA plots generated for the MEV constructs in eigenvalue rank; PC2 versus PC1, PC2 versus PC3, PC3 versus PC1.The colours are based on order of time and the cumulative variability at each data point.(a) The PCA plot based on the first MEV construct.(b) The PCA plot based on the third MEV construct.

Figure 12 .
Figure 12.The dynamical cross-correlation map generated based on the MEV constructs.The blue regions indicate the residues moving in a singular direction, while the pink regions indicate that the residues moved in opposite directions.(a) The DCCM generated from MEV1.(b) The DCCM generated from MEV2.

table 3
).All the vaccine sequences had theoretical pI values ≥7, II values ≥40 and negative GRAVY values, indicating their basic, unstable, and hydrophilic natures (table 3).The molecular weight of the sequences ranged from 58.74 kDa (MEV5) to 76.53 kDa (MEV1) (table 3).The aliphatic index values for the constructs indicate varying levels of thermostability, with MEV5 having the greatest value (69.84) and MEV2 the lowest (59.27) (table

Table 1 .
The CD8 + and CD4 + epitopes, along with the best energy values.The CD8 + epitopes consist of 9 amino acids, and the CD4 + epitopes consist of 15 amino acids.
day period (figures 8a and 9a; electronic supplementary material, figure

Table 3 .
The physico-chemical properties of the five vaccine candidate sequences.pI = isoelectric point, II = instability index, GRAVY = grand average of hydropathicity.

Table 4 .
Aggregation analysis regarding the three complexes and individual chains A and B. AP = aggregation prone, NI = negligible influence.
lower, but the class II population coverage was higher (95.43%).The individual and global population coverages indicate the potential of some degree of effectiveness in the endemic areas.

Table 5 .
The energy composition profile (kcal mol −1 ) based on the MM/ GBSA analysis of the two vaccine complexes.ΔE VDW = the van der Waals contribution from MM. ΔE ELE = the electrostatic energy calculated by the molecular mechanics (MM) force field.ΔG gas = the gas-phase energy contribution.ΔG solv = the solvation free energy contribution.ΔG bind = the endpoint binding energy of the interaction of the complex.