Anti-COVID-19 multi-epitope vaccine designs employing global viral genome sequences

Background The coronavirus SARS-CoV-2 is a member of the Coronaviridae family that has caused a global public health emergency. Currently, there is no approved treatment or vaccine available against it. The current study aimed to cover the diversity of SARS-CoV-2 strains reported from all over the world and to design a broad-spectrum multi-epitope vaccine using an immunoinformatics approach. Methods For this purpose, all available complete genomes were retrieved from GISAID and NGDC followed by genome multiple alignments to develop a global consensus sequence to compare with the reference genome. Fortunately, comparative genomics and phylogeny revealed a significantly high level of conservation between the viral strains. All the Open Reading Frames (ORFs) of the reference sequence NC_045512.2 were subjected to epitope mapping using CTLpred and HLApred, respectively. The predicted CTL epitopes were then screened for antigenicity, immunogenicity and strong binding affinity with HLA superfamily alleles. HTL predicted epitopes were screened for antigenicity, interferon induction potential, overlapping B cell epitopes and strong HLA DR binding potential. The shortlisted epitopes were arranged into two multi-epitope sequences, Cov-I-Vac and Cov-II-Vac, and molecular docking was performed with Toll-Like Receptor 8 (TLR8). Results The designed multi-epitopes were found to be antigenic and non-allergenic. Both multi-epitopes were stable and predicted to be soluble in an Escherichia coli expression system. The molecular docking with TLR8 also demonstrated that they have a strong binding affinity and immunogenic potential. These in silico analyses suggest that the proposed multi-epitope vaccine can effectively evoke an immune response.


INTRODUCTION
COVID-19 has been declared a pandemic by the World Health Organization (WHO) as of the 11 March 2020 (WHO, 2020). The outbreak of SARS-CoV-2 infection starting in Wuhan, China has now affected millions of lives across the world. The common symptoms patients exhibit include fatigue, fever, dry cough, shortness of breath and upper airway congestion (Chan, Wong & Tang, 2020). The delay in symptom appearance due to the incubation period of the virus has caused more harm as infected people without symptoms might have already infected many others even before knowing they are carrying the infection. The number of infected people is increasing daily. Scientists are working tirelessly to find a cure, but as of now, the impasse continues.
Coronaviruses (CoVs) are a family of viruses causing diseases ranging from the common cold to more severe illnesses such as Severe Acute Respiratory Syndrome (SARS, caused by SARS-CoV) and Middle East Respiratory Syndrome (MERS, caused by MERS-CoV) (Ceraolo & Giorgi, 2020b;Sahin, 2020). Belonging to the family of Betacoronavirus, the latest CoV, recently named as SARS-CoV-2, is an enveloped spherical virion about 60-140 nm in diameter. It has a positive-sense single-stranded RNA genome of~30 kb. The virus is believed to have originated from bats and is phylogenetically linked closest to the bat-SL-CoVZC45 and bat-SL-CoVZXC21 (Zhu et al., 2020). The virus is highly contagious and spreads from person-to-person through the respiratory droplets when an infected person sneezes or coughs or when a person encounters contaminated surfaces with the virus. The disease state caused by viral infection has been termed as COVID-19 (Chan, Wong & Tang, 2020).
Considering the gravity of the situation of COVID-19 worldwide, there is a dire need for the development of an effective vaccine or antiviral drugs that are effective in treating this viral infection. Currently, there is no FDA approved treatment or licensed vaccine against COVID-19. Vaccination is the most desired option available at our disposal to eradicate the diseases associated with infectious microorganisms (Greenwood, 2014). Reverse vaccinology is a valuable technique that adopts genome-level analysis to identify potential antigenic determinants of the pathogenic microorganism (Rappuoli, 2001). Previously designed multi-epitope vaccines against various emerging viruses have been reported as safe and generate potent immune responses.
To elicit strong immunogenic responses and have long-term efficacy, vaccines against coronavirus must activate both humoral and cell-mediated immune responses (Amanna & Slifka, 2011). Virus-specific T cells, the effector T cells, provide vaccine-mediated protection at the peak of antiviral response and the antiviral cytokines production is also increased (Sallusto, Geginat & Lanzavecchia, 2004), but a successful vaccine would activate B-cell mediated humoral immunity as well. In the case of re-infection, memory B cells can be re-activated via B cell receptors and provide protection (Amanna, Carlson & Slifka, 2007). Therefore, it is a challenge to activate both humoral as well as cell mediated immunity of required levels to effectively provide protection against  In this study, we have utilized the 475 genomes of SARS-CoV-2 available in NGDC (available at: https://bigd.big.ac.cn/) and GISAID (available at https://www.epicov.org/ epi3/cfrontend) database as of 15 March 2020 to identify conserved immunogenic epitopes representing the novel coronavirus strains (collection) and to design a potent multi-epitope vaccine against the COVID-19 infection.

MATERIALS AND METHODS
Retrieval of sequences and multiple alignment GISAID (available at https://www.epicov.org/epi3/cfrontend) and NGDC databases were mined to retrieve updated complete genomes of SARS-CoV-2. The retrieved genomes were then aligned using the Multiple Alignment using Fast Fourier Transform tool (MAFFT) (available at: https://www.ebi.ac.uk/Tools/msa/mafft/) (Katoh et al., 2002). Default parameters were used encompassing 200 PAM/κ = 2 scoring matrix and a gap penalty of 1.53. These aligned sequences were visualized and analyzed in Unipro UGENE v.34 software (Okonechnikov et al., 2012) to generate a consensus sequence of all genomes and later, nBLAST was performed to analyze its diversity with Reference sequence NC_045512.2. The ORFs from this sequence were retrieved and subjected to further analysis of predicted proteins.
(available at: http://tools.iedb.org/immunogenicity/) (Calis et al., 2013). This tool works best for 9-mer epitopes but larger lengths can be employed too. Only those epitopes that showed a strong binding affinity with HLA super family alleles were shortlisted. Default threshold was used for prediction of results. Epitopes having percentage rank less than 0.5 were qualified as strong binders.
Strong binding affinity with HLA alleles and population coverage analysis CTL Epitopes that were both antigenic and immunogenic were screened for their binding affinity with HLA superfamily alleles using Net MHC 4.0 (available at: http://www.cbs.dtu. dk/services/NetMHC/) (Nielsen et al., 2003;Andreatta & Nielsen, 2016). The tool predicts results on the basis of an ANN that has been trained using 81 different HLA alleles. MHC binding affinity of shortlisted epitopes was checked against HLA superfamily alleles (i.e., HLA-A0101, HLA-A0201, HLA-A0301, HLA-A2402, HLA-A2601, HLA-B0702, HLA-B0801, HLA-B2705, HLA-B3901, HLA-B4001, HLA-B5801, HLA-B1501). HLA supertype representative are clusters of HLA alleles based on their overlapping peptide binding specificity, hence can cover 80% of population in a less computationally exhaustive environment. The concept has been effectively utilized in prediction of promiscuously recognized epitopes against SARS, HCV, vaccinia virus and HBV (Frahm et al., 2007;Sidney et al., 2008). Furthermore, population coverage analysis was performed using IEDB resource server. The tool predicts population coverage of epitopes based upon the restriction database (Bui et al., 2006).

HTL epitope mapping and population coverage analysis
The ORFs that didn't yielded prioritized CTL epitopes were subjected to Helper T Lymphocytes (HTL) epitope (CD4 T cell epitopes) prediction by the HLApred server. This was done to cover the whole proteome of the virus. All alleles of MHC-II were selected, and experimental as well as predicted binders were included in the analysis. A default threshold was used, and all predicted epitopes that shared no identity with humans were selected. The predicted epitopes were screened for their antigenic potential using VaxiJen as discussed earlier. The antigenic epitopes were further screened for their IFN-gamma induction potential using IFNepitope using hybrid approach that involves both SVM and Motif based approach. The prediction model chosen for analysis was IFN-gamma vs non IFN-gamma (available at: http://crdd.osdd.net/raghava/ifnepitope/scan.php) (Dhanda, Vir & Raghava, 2013). The epitopes screened positive for IFN-gamma were checked for overlapped B cell epitopes predicted by ABCpred (available at: http://crdd.osdd.net/ raghava/abcpred/) using a default threshold (Saha & Raghava, 2006b). Lastly, the shortlisted epitopes were checked for their strong binding affinity with all the HLA DR alleles present in Net MHCII server (available at: http://www.cbs.dtu.dk/services/ NetMHCII/). These alleles were DRB1_0101, DRB1_0103, DRB1_0301, DRB1_0401,  DRB1_0402, DRB1_0403, DRB1_0404, DRB1_0405, DRB1_0701, DRB1_0801, DRB1_0802,  DRB1_0901, DRB1_1001, DRB1_1101, DRB1_1201, DRB1_1301, DRB1_1302,  DRB1_1501, DRB1_1602, DRB3_0101, DRB3_0202, DRB3_0301, DRB4_0101,  DRB4_0103, DRB5_0101. HLA-DR alleles are present on the surface of antigen-presenting  cells and present the epitopes to helper T-cells initiating an immune response  (Smith -Garvin, Koretzky & Jordan, 2009). HLA DRB alleles were reported to be involved in viral clearance, hence all the available HLA DRB alleles in server were selected for analysis (Barrett, Ryan & Crowe, 1999;Keicho et al., 2009). Lastly, Population coverage analysis was performed using IEDB population coverage analysis tool.

Multi-epitope vaccine design and construction
Epitopes were arranged as per their arrangement in the reference sequence, the CTL epitopes were linked together using a flexible linker GGGGS as recommended by Chen, Zaro & Shen (2013), and HTL epitopes with GPGPG linker to construct a multi-epitope (Saadi, Karkhah & Nouri, 2017). Moreover, a second multi-epitope construct was also designed where β-defensin, an adjuvant, was added at the N terminus of multi-epitope with EAAAK linker (Shey et al., 2019).

Sequence-based physio-chemical properties of the multi-epitope design
Different physicochemical properties of multi-epitopes were predicted using ProtParam tool including prediction of in vivo half-life, theoretical isoelectric point (pI), molecular weight, instability index, aliphatic index and Grand Average of Hydropathy (GRAVY) index of peptides (Wilkins et al., 1999;Gasteiger et al., 2005). The degradation of peptides was based on amino acid at N-terminus hence N-end rule in Protparam was employed to predict in vivo half-life of both peptides. Instability index with a threshold of 40 was used in order to determine the stability of the protein in a cellular environment. The aliphatic and GRAVY index of both peptides was calculated based on their amino acid profiles.

Allergenicity of multi-epitope
Both multi-epitope constructs (with and without adjuvant) were screened for their allergenicity using two online servers; AlgPred (available at: http://crdd.osdd.net/raghava/ algpred/) (Saha & Raghava, 2006a) and Allergen FP (available at: http://ddg-pharmfac. net/AllergenFP/) (Dimitrov et al., 2014). In AlgPred, all four approaches were used including mapping of IgE epitopes and PID approach (allergenicity prediction is based on similarity with experimental IgE epitopes), BLAST search on allergen representative peptides (makes prediction by performing BLAST search against 2890 allergen representative peptides), MEME/MAST motif approach (predicts allergenic motifs using MAST) and hybrid approach. Allergen FP predicts the allergenicity of peptide using an alignment-free method by implementing a four-step algorithm and compared by Tanimoto coefficient.

Antigenicity of multi-epitope design and modeling
The antigenicity profiles of both multi-epitope constructs were predicted again to confirm if the epitopes were antigenic or not. We used two servers this time; VaxiJen (available at: http://www.ddg-pharmfac.net/vaxijen/) (Doytchinova & Flower, 2007) and AntigenPro (Magnan et al., 2010). Scratch Protein Predictor server available at http://scratch.proteomics.ics.uci.edu/ hosted the tool Antigen Pro for antigenicity estimation. AntigenPro, like VaxiJen, predicts antigenicity using physicochemical properties, irrespective of the length of peptides as well as its alignment.

Secondary structure prediction
For evaluation of further structural characteristics of designed multi-epitopes, PDBsum (available at http://www.ebi.ac.uk/thornton-srv/databases/pdbsum/) (Laskowski et al., 2018) was used. PDBsum provides insight into the unique structural aspects of proteins, peptides and their ligands.

Tertiary structure prediction and refinement
Tertiary structures of both vaccines were predicted by Scratch Protein Predictor-hosted 3Dpro tool (available at http://scratch.proteomics.ics.uci.edu/) (Cheng et al., 2005) and refined by Galaxy refine (available at http://galaxy.seoklab.org/cgi-bin/submit.cgi? type=REFINE) (Ko et al., 2012). The model showing highest Ramachandran favored residues and minimum poor rotamers were selected. The refined structures were checked for their stability and flexibility using molecular dynamics (MD) simulation studies. Structural flexibility of a protein/peptide is important for its molecular recognition and its function, so a coarse-grained protein model implemented in a webserver CABS-Flex 2.0 (available at: http://biocomp.chem.uw.edu.pl/CABSflex/) (Kuriata et al., 2018) was used for near-native dynamics of both vaccines. Default distance restrained parameters were used, number of cycles and cycles between trajectory frames were raised to 100. Temperature of simulation was also kept default.

Molecular docking with Toll-Like Receptor 8
The models were then checked for their interaction with Toll-Like Receptor 8 (TLR8). Active and passive residues for these analyses were predicted using CPORT (De Vries & Bonvin, 2011). Molecular docking was performed by using guru level interface of HADDOCK 2.2 (available at: http://haddock.science.uu.nl/services/HADDOCK/ haddockserver-guru.html) (Van Zundert et al., 2016) with default parameters and a representative structure from the top-ranked docked cluster with minimum HADDOCK score was refined using Refinement Interface (available at: http://haddock.science.uu.nl/ services/HADDOCK/haddockserver-refinement.html) (Van Zundert et al., 2016). Moreover, NMA analysis of refined complexes was also performed using iMOD in order to determine their deformation potential (Kremer, Mastronarde & McIntosh, 1996) (available at: https://bio3d.colorado.edu/imod/paper/). The interacting residues between each complex were predicted using PDBsum analysis (Laskowski et al., 2018).

Codon optimization and in silico cloning in Escherichia coli
Codon optimization of both multi-epitopes was performed using JCAT (available at: http://www.jcat.de/) (Grote et al., 2005) and codons were adapted as per E. coli K12 strain.
Default parameters were used and plasmid UC19 was used for in silico cloning of both multi-epitopes. Moreover, a His6 tag was added at the both ends of sequences for the purification of multi-epitope proteins.

CTL epitope mapping
A total of 13 CTL epitopes were shortlisted for designing multi-epitopes. These shortlisted epitopes were predicted to be antigenic (as determined by VaxiJen tool), immunogenic (as determined by MHC-I IEDB Immunogenicity tool), and had potential to bind with HLA superfamily alleles (as predicted by NetMHC) (     Table S1.

HTL epitope mapping
A total of 10 HTL epitopes were scrutinized for multi-epitopes. These shortlisted epitopes were predicted to lack homology with Human proteome (as determined by HLApred), had overlapping B cell epitopes (as determined by ABCpred), showed potential to induce IFN-gamma response (as determined by IFNepitope server) and demonstrated strong binding affinity with HLA DR alleles (as predicted by NetMHCII) ( Table 2). All the epitopes had 89.70% population coverage (as predicted by IEDB population coverage analysis tool). The detail of step wise scrutiny and all the initially predicted epitopes is provided in Table S2.

Multi-epitope structure design and modeling
Two multi-epitope constructs were designed with and without β-defensin adjuvant to analyze and compare whether the designed multi-epitope is efficacious only in presence of adjuvant or has the ability to elicit immune response solely. Cov-I-Vac was the multi-epitope construct without adjuvant and Cov-II-Vac was with adjuvant (Fig. 2).  Note: CD4 T cell epitopes having strong antigenicity, overlapping B cell epitopes and IFN induction potential that were screened to design multi epitopes. Antigenicity, allergenicity and immunogenicity of Cov-I-Vac and Cov-II-Vac The antigenicity potential was found to be retained in both multi-epitopes and was estimated as 0.6153 and 0.5947 for Cov-I-Vac and Cov-II-Vac, respectively. Both constructs were reported as antigenic by AntigenPro as well with an antigenicity score of 0.516406 and 0.386605. Moreover, both vaccines were soluble upon overexpression in E. coli. The safety profile also showed that constructs do not exhibit any experimentally reported allergens and thus projected to be non-allergens.

Secondary structure of Cov-I-Vac and Cov-II-Vac
The secondary structure prediction of Cov-I-Vac shows that it has a sheet and beta hairpin between two strands (Ala160-Il161, Gly164-Gly165), six helices, 42 beta turns and two gamma turns (Fig. 3A). The secondary structure of Cov-II-Vac shows that it has a sheet and two beta hairpins among three strands (Val193-Thr198, Gly201-Val208, Gly216-Gly217), 10 helices, five helix-helix interactions, and 37 beta turns (Fig. 3B).

Tertiary structure prediction and refinement
The tertiary structure of Cov-I-Vac had 10.2% poor rotamers with 90.5% Rama favored residues; however, the refined structure selected for further analysis had an RMSD of 0.541 with 0% poor rotamers and 93% Rama favored residues. Similarly, the initial structure of Cov-II-Vac had 5.9% poor rotamers with 89% Rama favored residues, but the refined shortlisted structure with RMSD 0.548 had 92.9% Rama favored residues and 0% poor rotamers.

CABS-flex analysis
The trajectory of 10 models was generated after simulation (Fig. 4). The Root Mean Square Fluctuation (RMSF) of this simulation was found to be between 3.9 and 5.8 Å.

The trajectory of Cov-II-Vac after fast simulation
Trajectory analysis of fast simulation of Cov-II-Vac was analyzed and the RMSF of the structure was found to be 0.75-7.25 Å. The maximum fluctuation was observed at residue 227 while minimum fluctuation was residue 39-59 (Fig. 5).
The active and passive residues of Cov-I-Vac, Cov-II-Vac and TLR8 are provided in Table S3. The molecular docking of Cov-I-Vac with TLR8 resulted in the top-ranked cluster with a HADDOCK score of −40.8 +/− 5.5. Low values of HADDOCK score, especially those that are negative, usually imply significantly high interaction between proteins. HADDOCK refined the representative structure of this complex and clustered the resulting 20 structures into one complex that represented 100% of the refined water models generated. After undergoing molecular refinements, the HADDOCK score of the Cov-I-Vac-TLR8 complex improved significantly to reach −251.3 +/− 2.3. The details of the refined interaction along with statistical parameters are provided in Table 3. Cov-II-Vac also showed significant interaction with TLR8 that was −83.5 +/− 6.3 which was a higher interaction score than Cov-I-Vac. However, after refinements, the interaction of Cov-II-Vac with TLR8 was lower compared to Cov-I-Vac-TLR8 interaction, that is, −217.4 +/− 3.5. The details of HADDOCK score with all its parameters are mentioned in Table 3. For an in depth analysis of protein-protein interactions, PDBSum analysis revealed that a total of 36 residues of Cov-I-Vac interact with 43 residues of TLR8, with one salt bridge, 15 hydrogen bonds and 210 non bonded interactions. The detail of the interaction is illustrated in Fig. 6 and atomic level interaction is provided in Table S4. Similarly, 36 residues of Cov-II-Vac interacted with 41 residues of TLR8, prominent among them were residues of multi-epitopes of 12 hydrogen bonds and 168 non bonded contacts. The interacting residues of the complex are illustrated in Fig. 7 while distances of detailed atomic interaction are provided in Table S5.   Codon optimization and E. coli expression Codon optimization results of both vaccines Cov-I-Vac and Cov-II-Vac (with GC content 58.99% and 58.49%, respectively) were within the optimum limits . The CAI value was predicted to be 1.0, which projects high level expression of our designed vaccine constructs in E. coli K12 strain. The total length of the Cov-I-Vac clone was 3.6 kbp while Cov-II-Vac clone was 3.8 kbp. Both sequences were designed to be added to the plasmid between restriction sites AfiIII-pciI and BspQI-sapI (Fig. 9).

DISCUSSION
The COVID-19 pandemic has affected 213 countries around the world with almost 6.3 million patients and more than 378 thousand deaths (Worldometersinfo, 2020). The current situation urges scientists all over the world to find an urgent solution to stop this pandemic and develop effective therapeutics (WHO, 2020). To ensure viral clearance, cell mediated and humoral responses must be induced by the action of CD8 and CD4 T cells (Ikram et al., 2018). The reliability of such responses has enabled the use immunoinformatics approaches for vaccine development against viral diseases. The present study encompassing 475 complete genomes from all around the world suggests that while there is emerging variation in SARS-CoV-2, the rate with which this variation arises is quite slow (Ceraolo & Giorgi, 2020a). The consensus sequence generated from the alignment of all complete genomes was found to be 99% identical to the reference sequence reported from Wuhan. The phylogenetic analysis of 475 genomes of SARS-CoV-2 reveals that genome sequences reported from China belong to clades containing those from almost all of the countries of the world and covering the whole span of a phylogenetic tree. The relatedness of SARS-Cov-2 strains was expected considering that the outbreak originated from China (Nezafat et al., 2016;Mackenzie & Smith, 2020;Wu et al., 2020a).
The present study was mainly conducted to propose multi-epitope based vaccines against SARS-CoV-2 by an immunoinformatics approach similar to other studies (Tomar & De, 2010;Dar et al., 2016;Ikram et al., 2018). These vaccines are advantageous compared to other monovalent vaccines because they provide superior protection by eliciting strong humoral and cell mediated immunity (Amanna & Slifka, 2011). Three subunit vaccines have been previously reported that were based on one, three, and 10 proteins (Abdelmageed et al., 2020;ul Qamar et al., 2020).
In recent years, epitope-based vaccines, developed via immunoinformatics approach, have gained global recognition as an alternate therapeutic strategy because of the various advantages they possess. A multi-epitope vaccine was designed against MERS, a virus belonging to the same family of viruses as SARS-CoV-2, using in silico tools (Srivastava et al., 2018). Another multi-epitope peptide vaccine was designed against brucellosis which elicited strong T cell responses (Saadi, Karkhah & Nouri, 2017). Shey et al. (2019) reported a similar approach used for prophylactic/therapeutic multi-epitope vaccine design for Onchocerciasis caused by microflaria. Promising results were shown in mice models by a multi-epitope vaccine developed by Lin et al. (2016) against Epstein-Barr virus, designed using bioinformatics tools. These studies highlight the importance of multi-epitope vaccine design to combat infections, and validate the methodology adopted in our study.
The vaccine designed in our study used all predicted ORFs and therefore covers the complete proteome. Epitopes selected were 13 CTL based and 10 HTL based that elicit cell mediated and humoral immune responses, respectively. Therefore, our subunit vaccines are projected to be effective and immunogenic against SARS-Cov-2. Coding sequences of the proteins of SARS-CoV-2 from NCBI were evaluated for their antigenicity. Highly antigenic sequences were used to predict B and T cell epitopes, the key step in vaccine development. Antigenic T cell epitopes were screened to overlap with B-cell and IFN-gamma epitopes (Van Regenmortel, 1996). HTL epitopes overlapping with B cell epitopes were further proceeded for multi-epitope construction. HTL epitopes overlapped with B-cell epitopes have been used in vaccine design studies to generate humoral response (Lehtinen et al., 1995;Dar et al., 2019). Cellular immunity mediated by CTL and humoral immunity mediated by HTL epitopes can aid viral clearance from body. Subunit vaccines proposed against HIV using HTL and CTL epitopes are under clinical trials (Newman et al., 2002).
The multi epitope vaccines, proposed in this study, were designed with the help of linkers; epitopes were linked with GGGGS (CTL based) and GPGPG (HTL based) linkers. Linkers were added to help maintain the proper functioning of each epitope after being imported to the human body independently . GGGGS linker was added between CTL epitopes to ensure flexibility as well as solubility and pH stability (Chen et al., 2017). Addition of the GPGPG linker induces strong HTL responses as it not only facilitates the processing and presentation of epitopes but also prevents the formation of junctional epitopes in order to maintain immunogenicity (Ikram et al., 2018;Wu et al., 2020b).
An adjuvant, β-Defensin was added to the N-terminus of Cov-I-Vac to make it Cov-II-Vac, using EAAAK linker, in order to increase the immunogenicity of the vaccine. N-terminus amino acid plays a role in ubiquitin mediated degradation (Wilkins et al., 1999). The EAAAK linker effectively separates the functional domains and decreases the chances of interaction among them (Arai et al., 2001). It also increases the bioactivity and expression of epitopes (Bai & Shen, 2006;Amet, Lee & Shen, 2009).
β-defensins are effector antimicrobial peptides having antimicrobial activity including antiviral, antibacterial and antifungal activity (Lehrer & Lu, 2012). They recruits immature dendritic cells and naïve T-cell at the site of an infection (Allaker, 2008) and use both Th1 and Th2 dependent pathways to boost the synthesis of antigen-specific immunoglobulins (Tani et al., 2000). Kim et al. (2018) demonstrated that the use of human β-defensins can initiate both innate and adaptive immune response against the conjugated antigen. Several studies suggest that β-defensins efficiently induce a prolonged humoral as well as cellular immune response against a pathogen (Allaker, 2008) which makes them a potent antiviral vaccine adjuvant.
The two multi-epitopes Cov-I-Vac and Cov-II-Vac are 317 and 367 amino acids long respectively with a molecular mass of 31.6 kDa and 37.2 kDa, respectively. Both predicted proteins are basic according to theoretical pI values. The instability index and aliphatic index scores calculated using in silico analysis indicated that the vaccine proteins can prove to be stable and thermostable in a test tube, if tested experimentally. The GRAVY score for our vaccines was positive and suggests that both are hydrophobic, which may necessitate the use of micelles for better interaction of the vaccine protein within the polar environment of the body . The efficacy of subunit vaccines can be improved by addition of nanoparticles-based antigen delivery system as they have smaller size and an ability to deliver the antigens to antigen presenting cells in lymph nodes. Self-assembled micelles, in this regard, are adjuvant of choice for amphiphilic molecules. Historically, these systems have been used to encapsulate hydrophobic drugs in micelle core as reviewed by Trimaille and Verrier (Trimaille & Verrier, 2015).
Cov-I-Vac and Cov-II-Vac were found to be antigenic, immunogenic, non-allergenic and strong MHC binders. These factors suggest that the multi-epitopic vaccines will have a robust immune response without allergic reaction.
Molecular Dynamic Simulation studies are mainly conducted to check the stability of vaccine by simulating it in a cellular environment (Ikram et al., 2018). However, these simulation studies take longer time so fast, dynamic simulation, CABS-flex analyses were conducted in this study that give similar results like MD simulations performed at 10ns. Moreover, the RMSF obtained from CABS-flex analysis is comparable to the RMSF of NMR ensembles (Jamroz, Kolinski & Kmiecik, 2014).
The efficacy of a multi-epitope vaccine depends on the population for which the vaccine is prepared. The human CTL and HTL based T cell epitopes included in our multi-epitope vaccines were 99.29% similar to those of the world population, indicating that it holds great potential to be an effective approach to combat the ongoing pandemic. In order to induce effective immunological responses in the body, the strong binding of the vaccine products with immune receptors is necessary. The molecular docking analysis of our designed multi-epitope based vaccine showed significant interaction with an immune receptor, TLR8, further predicting the efficacy of both vaccines. However, the vaccine Cov-I-Vac showed a stronger interaction with TLR8. The docking scores indicate that the epitopes predicted in this study have higher binding affinity with TLR8 and thus, have the ability to elicit an effective immune response. TLR8 recognizes the single stranded RNA and acts as a major switch to elicit protective immune responses. It initiates antiviral responses via MyD88-dependent pathway leading to the production of type I and III interferons and other pro-inflammatory cytokines via NF-κB activation (such as IL-1, TNF-a, IL-6) (Lester & Li, 2014;Prompetchara, Ketloy & Palaga, 2020). These immune responses are otherwise suppressed in case of novel SARS-CoV-2 infection, therefore, interaction of predicted vaccine with TLR8 would produce a diverse set of immune mediators leading to initiation of a protective immune response and result in viral clearance, ideally (Felsenstein et al., 2020).
The CAI value after in silico cloning in vector suggests that translational efficacy, mRNA codons, of both multi-epitope vaccines is compatible with the host system. Moreover, it also indicates that synthesis of both vaccines is experimentally possible with higher expression within the E. coli K-12 system. The advantages of using pUC19 plasmid for cloning of Cov-I-Vac and Cov-II-Vac are high copy number, smaller sized plasmid and color differentiation upon over expression (color of recombinant colonies appears white while non-recombinant ones appear blue) (Yanisch-Perron, Vieira & Messing, 1985). The in silico cloned multi-epitopes can be used for experimental validation of these results. In silico cloning of vaccines in vectors has also been performed in studies suggested by Dar et al. (2019).
We believe that immunoinformatics approaches are useful to design effective vaccine candidates that when tested in experimental studies yield proposed outcomes (Ikram et al., 2018). All the ORFs of SARS-CoV-2 were subjected to epitope mapping; however, it is noteworthy to mention that no epitope from 3′ UTRs had strong binding affinity with HLA super family alleles. Therefore, no epitopes from this region are part of either CovI-Vac and CovII-Vac vaccines. However, the multi-epitopes do contain strong HLA superfamily allele binding regions from structural as well as non-structural proteins. These structural, as well as non-structural proteins, play an important part in viral assembly and attachment and specific immune responses against them are required to combat the infection (Lorente et al., 2016;Astuti, 2020). Despite the numerous benefits associated with this computational study, experimental validation is required to verify these results.

CONCLUSIONS
The study was aimed to predict a vaccine against SARS-CoV-2 using immunoinformatics approaches. Two multi-epitopes were designed, one without adjuvant (Cov-I-Vac) and the other with β-defensin adjuvant (Cov-II-Vac). The in silico designed multi-epitopes Cov-I-Vac and Cov-II-Vac are predicted to be antigenic, immunogenic, thermostable and safe for human applications. The multi-epitopes cover the predicted proteome of SARS-CoV-2 (Fig. 2). Cov-I-Vac demonstrates better interaction with TLR8 and more energy is required to deform the structure as compared to Cov-II-Vac. Future prospects of the study involve experimental validation of these results.
Kanwal Naz conceived and designed the experiments, performed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft. Zaara Ishaq performed the experiments, analyzed the data, authored or reviewed drafts of the paper, and approved the final draft. Tahir Ahmad conceived and designed the experiments, analyzed the data, authored or reviewed drafts of the paper, and approved the final draft. Nimat Ullah analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft. Syeda Marriam Bakhtiar analyzed the data, authored or reviewed drafts of the paper, and approved the final draft. Syed Aun Muhammad analyzed the data, authored or reviewed drafts of the paper, and approved the final draft. Amjad Ali conceived and designed the experiments, analyzed the data, authored or reviewed drafts of the paper, and approved the final draft.

Data Availability
The following information was supplied regarding data availability: The raw measurements are available in the Supplemental Files.

Supplemental Information
Supplemental information for this article can be found online at http://dx.doi.org/10.7717/ peerj.9541#supplemental-information.