Introduction

Human herpesvirus 5 (HHV-5), also known as the Human Cytomegalovirus (HCMV), belonging to the betaherpesvirus subfamily is the cause of one of the most prevalent diseases affecting immunocompromised hosts [1]. Belonging to the same subfamily, Human herpesvirus 6 (HHV-6) is an immunotropic virus that specifically attacks cells involved in both, humoral and cell-mediated immune responses [2]. Among the Human herpesviruses, HHV-5 HHV-6 co-infection is the one most frequently associated with post-immunosuppression complications [3].

It has been observed that HHV-5 and HHV-6 remain in a latent state after the primary infection until reactivation due to an immunosuppressive condition. Spreading through body fluids, HHV-5 has a global seroprevalence of about 94% [4]. Congenital infection may cause complications including neurological issues, hearing loss, visual impairment, and death. Among healthy adults, the primary infection may cause fatigue, fever, muscle aches, and rarely mononucleosis. The most common symptoms following primary infection in immunocompromised adults include pneumonia, vision loss, encephalitis, and digestive system issue [5]. HHV-6 is usually prevalent among infants, and manifests in the form of skin rashes known as roseola. Other symptoms include high fever, encephalitis, meningitis, liver dysfunction, or abnormalities such as granulocytopenia, wherein a very low white blood cell count is observed [6]. Co-infection of HHV-5 and HHV-6 mostly occurs due to their reactivation during immunosuppressive conditions such as chemotherapy or organ transplantation. Studies show that higher activation rates of HHV-5 can be observed in cases with preliminary HHV-6 infection [3]. The immunomodulatory and immunosuppressive effects of HHV-6 can induce a predisposition to HHV-5 infection or its reactivation. With their pathologies having a synergetic effect, it can be understood that a co-infection would have worse clinical implications than infection with either one of the viruses [3]. This synergetic co-infection has also been shown to increase the risk of fungal and bacterial infections [3, 7].

The current strategy to treat an HHV-5 infection is a 6-month prescription of the antiviral drugs Valganciclovir and Cidofovir [1]. This strategy is limited due to not being cost-effective and the safety concerns associated with drugs [1, 8]. For HHV-6, antiviral drugs Foscarnet and Valganciclovir or combinations of them are used [9]. This too is limited by their toxicity, antiviral resistance and drug-drug interactions [10]. Another major disadvantage is the fact that antiviral drugs only target lytic infections and not latent ones [11]. Vaccination, on the other hand, can lead to the complete eradication of the viruses on a geographical level. While vaccination does not prevent the reactivation of latent infections, it would significantly decrease the rates of primary infections [12]. Various approaches for HHV-5 vaccine development were considered in previous studies, ranging from live attenuated vaccines to DNA, RNA or protein subunit-based vaccines [11]. While live attenuated vaccines such as the Towne attenuated strain [12] or the Towne-Toledo recombinant strain [13] showed significant immunogenicities, they are limited by their inherent risk of causing the disease they are supposed to provide immunity from [14]. DNA vaccines based on the pp65 and gB [15, 16] induced immune responses in 47.5–68% of the seronegative subjects and an RNA vaccine based on gB and pp65/IE1 fusion protein [17] showed antibody and T-cell responses in almost 97% of the subjects after the second dose. A recombinant protein vaccine derived from the envelope gB showed only 50% efficacy against a primary HHV-5 infection [18]. Though vaccine development for HHV-6 did not receive the same scrutiny that HHV-5 immunization endeavours did, efforts were put in to identify potential vaccine candidates based on viral protein complexes. A tetramer complex of envelope glycoproteins gH, gL, gQ1 and gQ2 was observed to induce immunization in mice against HHV-6 [19]. This tetramer targets the human CD134 host receptor, which is expressed on activated T lymphocytes. This interaction determines the HHV-6B cell tropism and is necessary for HHV-6B entry into the vulnerable cells.

Till date, there have been no attempts to develop a vaccine against a co-infection of HHV-5 and HHV-6. In this scenario, immunoinformatics tools were used to construct a multi-epitope vaccine candidate in order to provide immunity against a synergetic co-infection of the two betaherpesviruses. In essence, the present computational vaccine design study with aid of unique glycoprotein is expressed on the virus envelope of human herpesvirus is indeed an important supplement in the scientific literature of glycoprotein based therapeutics. The strain such as AD169 for HHV-5 and U-1102 for HHV-6 were chosen based on their quality, completeness and validity. We hope that present study employing immunoinformatics pipeline would provide a safer platform for vaccine development and provide clue for the experimental biologist working in this field.

Methodology

Amino acid sequence retrieval

In this study, an ensemble of computational tools was adopted to predict ideal vaccine candidates for the HHV-5 and HHV-6 using viral envelope glycoproteins. For the epitope prediction, a total of 9 envelope glycoproteins from HHV-5 and 7 envelope glycoproteins from HHV-6 were chosen. Each of these glycoproteins was selected due to their roles in viral attachment to the host cells and the fusion of viral and host plasma membranes [20,21,22,2325, 26]. The protein sequences were retrieved from the UniProt database (https://www.uniprot.org/uniprot/P06473). Of note, the strains AD169 (HHV-5) and U-1102 (HHV-6) envelope glycoprotein sequences were chosen because they were longest and experimentally verified sequence with highest annotation scores.

Linear B-cell epitope prediction

B-cell epitope prediction was done using the ABCpred tool (http://crdd.osdd.net/raghava/abcpred/). The tool with a prediction accuracy of 65.93% utilizes a Recurrent Neural Network trained using a dataset consisting of 700 B-cell epitopes and 700 non-B-cell epitopes retrieved from the SwissProt database [28]. For the current study, a score threshold of 0.7 and an epitope length window of 10 were chosen for the predicted epitopes.

T-cell epitope prediction

Cytotoxic T-cell epitope prediction

Cytotoxic T lymphocyte (CTL) epitope prediction from the viral glycoprotein sequences was done using the NetMHCpan 4.1 tool (https://services.healthtech.dtu.dk/service.php?NetMHCpan-4.1). NetMHCpan 4.1 employs an Artificial Neural Networks (ANNs) to predict the peptide binding to any MHC molecule whose sequence is known. For the current analysis, 12 MHC I Allele subgroup belonging to the 2 major HLA (A and B) super types were selected. The predicted peptide length set to 9mer (Table S1). An epitope identification threshold of 0.5% was set for strong binders and 2% for weak binders [29].

Helper T-cell epitope prediction

Epitopes specific to Helper T lymphocytes (HTL) were predicted from the viral protein sequences using NetMHCIIpan 4.0 tool (https://services.healthtech.dtu.dk/service.php?NetMHCIIpan-4.0). A total of 27 MHCII alleles subgroup belonging to the 3 HLA (DR, DP and DQ) super types were considered (Table S1), as the binding efficacy of HTL epitopes to HLA-DR is a key factor in the immunogenicity of T-cell epitopes [30]. 15mer epitopes were predicted with the selection threshold at 1% for strong binders and 5% for weak binders [29].

Epitope characterization

Toxicity evaluation

ToxinPred tool (http://crdd.osdd.net/raghava/toxinpred/) which utilizes a Support Vector Machine trained using a dataset of toxic and non-toxic peptides was used to evaluate the toxicity of each predicted BCL, CTL and HTL epitopes along with important physicochemical properties like charge, isoelectric point (pI), hydrophobicity, etc. [31]. The understanding of vaccine’s physicochemical qualities is of useful to gain insight into the vaccine construct interaction with the environment. For instance, theoretical pI of a protein would be helpful in order to choose and improve the protein purification techniques.

Allergenicity prediction

The allergenicity of the epitopes was checked using the AllerTop 2.0 server (https://www.ddg-pharmfac.net/AllerTOP/) which uses a k-nearest neighbor (kNN) classifier trained on 2427 allergens and an equal number of non-allergens [32].

Antigenicity prediction

Antigenicity gives a measure of an antigen’s abilty to bind to B-cell and T-cell receptors, thereby eliciting an immune response. To identify probable antigenic epitopes, Vaxijen 2.0 server was used with a threshold of 0.4 (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html). Vaxijen utilizes alignment independent prediction (89% accuracy) based solely on the physicochemical properties of the proteins [33].

CTL Immunogenicity prediction

Immunogenicity evaluation for the predicted epitopes was done using IEDB class-I immunogenicity tool (http://tools.iedb.org/immunogenicity/) run with default parameters. The tool utilizes amino acid properties and their position within the peptide to make the immunogenicity prediction for a peptide-MHC complex [34].

Vaccine construction

To elicit an immune response, protein subunit vaccines must consist of an antigenic component from the pathogen. To accomplish this, the screened CTL, HTL and BCL epitopes along with an adjuvant were combined in a sequence using appropriate linkers. An adjuvant is any component that induces humoral or cellular immune response against an antigen [35]. Bacterial proteins with potential TLR agonist activity were considered as the adjuvants for the vaccine candidate [36]. The vaccine construct sequence begins with the adjuvant followed by BCL epitopes, with the first BCL epitoped joined to the adjuvant via an EAAAK linker. The BCL epitopes were attached to each other using KK linkers, and AAY linkers were used to attach CTL epitopes to the sequence, followed by HTL epitopes attached using GPGPG linkers.

Population coverage

Due to ethnic and geographic variations in the Human leukocyte Antigen (HLA) alleles, estimating the percentage of individuals covered by the respective HLA alleles is essential for ensuring vaccine efficacy. To determine the global population coverage, the IEDB population coverage analysis tool (http://tools.iedb.org/population/) was utilized with HLA class I and class II combined [37]. The analysis was carried out for T-cell epitopes along with their corresponding HLA alleles.

Physicochemical feature prediction

ProtParam tool (https://web.expasy.org/protparam/) was used to predict the physicochemical properties including atomic composition, amino acid composition, aliphatic index, molecular weight (kDa), extinction coefficient, theoretical pI, grand average of hydrophobicity (GRAVY), estimated half-life, and instability index of the constructed vaccine [38]. For performing a comparative evaluation of the physicochemical properties of the construct, postive controls C1 (Human Cytomegalovirus peptide vaccine) [39], C2 (typhoidal Salmonella serovars) [40], C3 (SARS-Cov-2 glycoprotein) [41] were chosen.

Secondary structure prediction

The secondary structure prediction for the vaccine construct was done using PSIPRED 4.0 (http://bioinf.cs.ucl.ac.uk/psipred/) which utilizes PSI-BLAST (Position-Specific Iterated-BLAST) to predict the secondary structure and an artificial neural network to predict the alpha helices, beta sheets and coils of the construct [42].

Tertiary structure prediction and its validation

Robetta server (https://robetta.bakerlab.org/) was utilized for predicting the tertiary structure of the constructed vaccine [43]. Validation for the predicted structure was done using PROCHECK, Verify 3D and ERRAT tools from the SAVES 6.0 server (https://saves.mbi.ucla.edu/). PROCHECK tool enables the construction of Ramachandran plots to evaluate the generated structure models [44]. The overall quality factor was determined using ERRAT [45]. Verify 3D utilizes DSSP- a hydrogen bond estimation algorithm for further validation that works by calculating the most likely secondary structure assignment for a given 3D structure [46].

Conformational B-cell epitope prediction

Conformational/Discontinuous B-cell epitopes arise due to conformational changes brought about by the physicochemical folding in the tertiary structure. Conformational epitope prediction was done using the ElliPro tool (http://tools.iedb.org/ellipro/) run with default parameters. Based on the geometrical properties of the construct, Ellipro utilizes Thornton’s method along with a residue clustering algorithm to make the predictions, and a Protrusion Index (PI) score is assigned to each predicted epitope [47].

Vaccine candidate-Immune receptor molecular docking analysis

In silico evaluation of the binding correlation between the vaccine candidate and an immune receptor was performed using the ClusPro server (https://cluspro.bu.edu/login.php) [32]. TLR4 was the receptor chosen, its PDB structure was retrieved from the PDB database and prepared for docking using PyMol visualization software. The interaction between the vaccine and the TLR4 was analysed to gain insight into the binding pattern.

Immune simulation

Immune simulation was performed for the vaccine candidate using the C-ImmSim server (https://kraken.iac.rm.cnr.it/C-IMMSIM/). C-ImmSim is an agent-based model implementation, which utilizes position-specific scoring matrices (PSSM) and machine learning models for epitope and immune response predictions defining both humoral and cellular responses of the immune system [48]. Three injections were given at 1, 84 and 168 simulation time steps (each 4 weeks apart), and the simulation was run for a total of 1050 time steps. Each injection contained 1000 vaccine proteins with no LPS.

Codon optimization and in-silico cloning

The amino acid sequence of the vaccine construct was converted to its nucleotide sequence and the resulting codons were adapted for efficient expression in E.coli K12, using the Java Codon adaptation tool (Jcat) (http://www.jcat.de/). Using SnapGene the optimized codon sequence was cloned into a common expression vector pET-28a ( +).

Result and discussion

Viral protein selection

The sequence of the viral envelope glycoproteins gB, gH, gL, gM, gN, gO, UL128, UL130, UL131A of HHV-5 (strain AD169) and gB, gH, gL, gM, gN, gQ1, gQ2 of HHV-6 (strain U-1102) were retrieved from the Uniprot database. Glycoprotein B (gB) present in both the viruses, plays a major role in virus entry into the host cell and intercellular spread [20, 21]. HHV-5 glycoproteins gH and gL form two alternative complexes, a trimer of gH/gL/gO which binds to the PDGFR-α on fibroblast surfaces [22] and a pentamer of gH/gL/gO/UL128/UL130/UL131A which is involved in the virion internalization into monocytes [23]. In HHV-6, a tetramer complex of gH/gL/gQ1/gQ2 along with gB is required for the fusion of viral and host cell membranes [21]. In both HHV-5 and HHV-6, gM and gN form a complex that is essential for membrane fusion and intracellular transport [24, 25]. The gM/gN complex has also been observed to induce the production of HHV-5 neutralizing antibodies in vitro [26] while gN alone has implications in protecting the virus from antibodies [27].

Linear B-cell epitope prediction

B-cell epitopes are parts of the antigen that bind to the antibodies, allowing B-cells to neutralize pathogens contributing to humoral immunity [49]. Using the ABCPred tool, epitope prediction was carried out with default threshold and epitope length set to 10mer, resulting in predicted 577 BCL epitopes. Out of these, 166 epitopes with a prediction score greater than 0.7 were selected. These were subjected to screening by checking toxicity and allergenicity via ToxinPred and AllerTop, respectively. This yielded 49 non-toxic and non-allergenic epitopes. The final 7 BCL epitopes were selected based on antigenicity scores predicted using Vaxijen. These include RHRDVQHGRR, RRRDVGDVKS, LTFSDRETLF, VKDQWHSRGS, ALLLKFNNLG, VTALSFRLVA and VKSTPPPEDK. The details of the BCL epitopes are illustrated in Table 1.

Table 1 List of screened BCL epitopes and their corresponding immunogenic characteristics

T-cell epitope prediction

Cytotoxic T-cell epitope mapping

Cytotoxic T lymphocytes (CTLs) play a vital role in cell-mediated immunity destroying infected cells via antigen recognition. T-cell epitopes are the peptides that activate CTLs by binding to MHC-I proteins, thereby enabling recognition. NetMHCpan 4.1 server was used to predict CTL epitopes from the viral protein sequence, yielding 5505 CTL epitopes. Out of these 1241 epitopes were classified as strong binders. Here 286 epitopes with %Rank-EL less than 0.5 were selected for further analysis. These epitopes were screened based on their predicted toxicity, allergenicity, and antigenicity scores yielding 91 epitopes. Further screening was performed, to select epitopes with antigenicity scores greater than 1.0 and immunogenicity scores greater than 0.30, yielding the following 7 epitopes DVIDVQYRF, ALSFRLVAL, QLVDLTLNY, YSNIGFLLY, ALSFINVTV, LLRHHFHCL and HMFFTNLTF for further analysis (Table 2).

Table 2 List of screened CTL epitopes and their corresponding immunogenic characteristics

Helper T-cell epitope mapping

Helper T lymphocytes (HTLs) play an indispensable role in giving rise to the adaptive immune response, by utilizing signalling molecules to activate B-cells, cytotoxic T-cells and macrophages [50]. HTL epitopes were predicted using the NetMHCIIpan 4.0 server, yielding 6329 epitopes. Out of these 689 were classified as strong binders. A total of 327 epitopes with a %Rank-EL less than 1.0 were selected for further characterization. 67 Non-toxic and non-allergenic epitopes were further screened, to select epitopes with antigenicity scores greater than 1.0, yielding the following 7 epitopes LDFNYLDLSALLRNS, YFEINDLKAVNLSAN, SFFAFQKIHPNLKGT, IVHFSYSTKNTGPMP, ALSFRLVALGAFAYC, VEALLLKFNNLGIQT and IDPLENTDFRVLELY for further analysis (Table 3).

Table 3 List of screened HTL epitopes and their corresponding immunogenic characteristics

Vaccine construction

A total of 21 epitopes (7 BCL, 7 CTL and 7 HTL) were used to make the final vaccine construct (Fig. 1). Appropriate linkers were used to connect the epitopes sequentially. For instance, an EAAAK linker is used to connect the adjuvant and BCL epitopes, a KK linker to connect BCL epitopes, an AAY linker to connect CTL epitopes and finally, a GPGPG linker to connect HTL epitopes. Out of the candidate bacterial protein adjuvants considered, CobT protein from Mycobacterium paratuberculosis, imparted the highest immunogenicity and hence it is chosen as the adjuvant for the final vaccine construct (Table 4).

Fig. 1
figure 1

Schematic representation of the multi-epitope vaccine construct

Table 4 Physicochemical properties of the vaccine construct

Population coverage analysis

The frequency of different HLA alleles varies with ethnicities and geographic regions. This necessitates determining the percentage of the global population covered by HLA alleles of the predicted epitopes. The results predict 99.95% global population coverage for the constructed vaccine (Table S2, S3 and Figure S1).

Physicochemical features and secondary structure prediction

ProtParam tool was used to determine the physicochemical properties of the final vaccine construct and the positive controls C1, C2 and C3. The predicted features for the final construct (Vaccine + CobT) include a length of 660aa, a molecular weight of 69.52 kDa indicating good antigenicity and a theoretical PI of 8.96 (basic behaviour). The instability index of the protein was calculated to be 23.70, classifying the protein as stable. The half-life of the protein was estimated to be 30 h in mammalian reticulocytes (in vitro), > 20 h in yeast (in vivo), and > 10 h in Escherichia coli (in vivo). An aliphatic index of 92.12 indicates moderate stability of the protein in vivo. The grand average of hydropathicity (GRAVY) score of -0.073 indicates a slightly hydrophilic nature of the protein. These properties are similar to those predicted for the controls C1, C2, and C3 (Table 4). The secondary structure prediction for the vaccine was performed using the PSIPRED tool, which indicated the construct to be 16.89% alpha-helices, 37.08% strands and 46.01% coils (Fig. 2).

Fig. 2
figure 2figure 2

Secondary structure of the final vaccine construct (a) Sequence plot (b) PSIPRED cartoon

Tertiary structure prediction and its validation

The tertiary structure of a protein provides different structural insights and is predicted via homology modelling. ROBETTA server was used for the 3D structure prediction, followed by its validation PROCHECK, ERRAT and Verify 3D tools from SAVES 6.0 server. Of the 5 models predicted by ROBETTA, Model-2 was the ideal choice (Figure S2). The Ramachandran plots from PROCHECK showed 92.40% of residues in the most favoured regions and only 0.40% of residues in disallowed regions. ERRAT predicted an overall quality factor of 97.8528 and Verify 3D demonstrated the overall structure accuracy to be 84.85% (Table S4). The 3D structure of the vaccine construct was visualized using PyMol (Fig. 3).

Fig. 3
figure 3

Modeled 3D structure of the final vaccine construct

Conformational B-cell epitope prediction

Conformational/Discontinuous epitopes mimic the properties of a linear epitope and can be used as a replacement for antibody production. Potential conformational epitopes were predicted using the ElliPro tool, yielding 5 conformational B-cell epitopes (Table S5). The predicted epitopes were ranked based on their Protrusion Index (PI) scores. The 3D structure of the top-ranking epitope consisting of 76 residues was visualized using jmol – a molecular viewer integrated within ElliPro (Fig. 4).

Fig. 4
figure 4

Predicted Discontinuous B-cell epitope with the best score

Vaccine construct-TLR4 docking studies

The envelope glycoproteins of HHV5 and HHV6 have been shown to effectively be recognized by TLR4 [51]. To determine this interaction, protein–protein docking between the vaccine construct and TLR4 (PDB ID: 3FXI) was performed using ClusPro 2.0 server. From among the resulting models, the one with the largest cluster (52) and least energy (-1084.8) was selected. The larger the cluster size, the greater the certainty of the cluster’s centre representing a putative model of the complex. Figure 5 illustrates the 3D structure of the docked complex. Further, the molecular interactions was visualized using the PDBSum. From Fig. 6, it can be observed that chains A and B from TLR4 interacted with the designed vaccine. There were 2 hydrogen bonds between chain B and the vaccine and a total of 13 hydrogen bonds between the residues of chain A and the vaccine. These interactions suggest that our proposed vaccine has the ability to be recognized by the T cell receptors.

Fig. 5
figure 5

3D visualisation of the vaccine-TLR4 docked complex

Fig. 6
figure 6

Molecular interactions between chains A, B of TLR-4 and the vaccine construct

Immune Simulation

The effectiveness of the vaccine construct to elicit an appreciable response from the adaptive immune system was determined using the C-Immsim server. Graphs of antibody, cytokine and lymphocyte concentrations were generated (Fig. 7). The antibody titre (IgG + IgM) rises after every dose and is the highest after the third dose. A simultaneous decrease in the antigen concentration was also observed after successive injections. Significant spikes in the B-cell and T-cell concentrations were also evident. Higher IgM concentrations ensure B-cell activation and the subsequent production of antibodies required for the secondary and tertiary immune responses. The vaccine has also been observed to induce high levels of IL-2 which has an immunoregulatory role and INF-g which is involved in APC activation.

Fig. 7
figure 7

Immune simulation analysis (a) The antigen, the immunoglobulins and the immuno-complexes (b) Concentration of cytokines and interleukins

Codon optimization and in-silico cloning

The Java Codon adaptation tool (JCat) yielded an improved codon sequence for the vaccine construct sequence, for ensuring efficacious expression in E.coli K12. The improved sequence showed a GC content of 56.46% and a Codon Adaptation Index (CAI) of 0.9975 signifying sequence stability and optimized expression in the host, respectively. The improved codon sequence was cloned between HpaI (1629) and SmaI (4300) of pET-28a ( +) which is a well-known expression vector resulting in a cloned vector of 4331 bp (Fig. 8).

Fig. 8
figure 8

In silico cloning of the designed vaccine construct

Conclusion

In the current investigation, a multi-epitope vaccine construct was generated using highly antigenic and immunogenic B-cell and T-cell epitopes predicted from the envelope glycoproteins of HHV-5 and HHV-6. Linkers such as KK, AAY and GPGPG were utilized to join the epitopes and an adjuvant was attached to boost its immunogenicity. The secondary and tertiary structures of the vaccine construct were predicted along with their physicochemical properties. The helix-rich nature of the construct along with metrics such as the instability index and aliphatic index reflects the stability of the constructed vaccine. Taking into account the global polymorphism of the MHC class I and class II molecules, a world population coverage analysis was conducted, and the results indicated that our proposed vaccine would induce an immunological response in 99.95% of the world's population. Similarly, molecular docking with the immune receptor TLR4 was performed and the results signifies the high binding affinity and stability of the vaccine-receptor complex. The immune simulation study showed a significant production of antibodies and cytokines after injecting vaccine. Finally, the nucleotide sequence for the vaccine construct was determined and the codons were adapted to be efficiently expressed in a bacterial host. In conclusion, the constructed vaccine has the potential to be translated into clinical practice to overcome the burden of HHV-5 and HHV-6 co-infection. The experimental studies on vaccine's efficacy in animal models is an interesting future direction and of very much importance to validate this findings.