Immunoinformatics Based Study of T Cell Epitopes in Zea m 1 Pollen Allergen

Background and Objectives: Zea m 1, a pollen allergen present in maize, is responsible for type I hypersensitivity reactions all over the world. Several effective medications are available for the disorder, but they have various side effects. Design and verification of a peptide-based vaccine is a state-of-the-art technology that is more cost-effective than conventional drugs. Materials and Methods: Using immunoinformatic methods, T cell epitopes are predicted from the whole structure of this allergenic protein. A conservation study of the predicted T cell epitopes among other pollen allergens worldwide is performed with a conservancy tool. This analysis helps to identify completely conserved HLA (human leukocyte antigen) binding epitopes. Lastly, molecular docking and MHC-oligopeptide complex binding energy calculations are applied to determine the interacting amino acids and the affinity of the epitopes for the class II MHC molecule. Results: Criteria-based analysis predicts the presence of two epitopes, YVADDGDIV and WRMDTAKAL, on this pollen allergen. Conclusions: The T cell epitopes identified in this study provide insight into a peptide-based vaccine for the type I hypersensitivity reaction induced by the Zea m 1 grass pollen allergenic protein.


Introduction
An epitope is a portion of an antigenic protein molecule that can be recognized by the immune system through either B or T lymphocytes. T cells interact with such peptides through their T-cell receptors. Antigenic proteins are processed intracellularly, and the resulting peptides are bound to at least one Major Histocompatibility Complex (MHC) protein and exposed on the surface of the antigen presenting cell as the MHC-peptide complex. Since T lymphocytes play an active role in this type of immune response, together with the specific MHC molecule of the antigen presenting cell, this type of epitope is called a T-cell epitope [1]. MHC molecules are cell surface glycoproteins that actively contribute to host immune reactions. In humans, they are expressed from the HLA (Human Leukocyte Antigen) genes and participate in the adaptive immune system [2].
The immunogenicity of a T cell epitope depends on various factors, e.g., suitable processing of the peptide from its protein source, stable binding of the small peptide to the MHC molecule and, lastly, recognition of the MHC-bound peptide by the T cell receptor [3]. There are two types of MHC molecules: MHC class I molecules normally present peptides 8 to 11 amino acids in length, whereas peptides binding to MHC class II may vary from 12 to 25 amino acids in length [4]. MHC class II molecules bind oligopeptide fragments obtained from the proteolytic cleavage of antigenic proteins and present them on the surface of antigen presenting cells (APCs) to be recognized by CD4+ T cells. When adequate amounts of the epitope are presented, the T cell may generate a specific adaptive immune response for that pathogen through the process of positive selection. Class II MHC molecules are expressed on specialized cells, e.g., professional APCs such as B cells, macrophages and dendritic cells, whereas class I MHC molecules are expressed on every nucleated cell of the human body [5].
In an ideal peptide-based vaccine, both B- and T-cell epitopes are present [2]. B cell and T cell epitopes are recognized by two different pathways. The three-dimensional conformation of the antigenic protein molecule is wholly responsible for recognition by B cells, whereas T cells can recognize an antigenic protein only after it has been digested into small peptide fragments, which must be bound to the MHC class II molecule, forming a ternary complex.
Allergic symptoms are among the most common health problems in the world. Among the four types of hypersensitivity reactions, type I hypersensitivity affects more than 25% of the world's population. Considering the various causes of allergic reactions, pollen allergens are regarded as the most likely source of hypersensitivity reactions. Pollinosis due to various pollens results in allergic rhinitis and asthma [6]. Grass pollen allergens from Cynodon dactylon, Oryza sativa, Zea mays etc. are responsible for allergic reactions in susceptible individuals in different parts of the world [7]. Allergic diseases can be successfully treated by identifying clinically important allergens. Worldwide, about 400 million people suffer from hay fever as well as seasonal asthma [8]. The major causative biomolecules for this allergy are pollen proteins known as the group-1 grass pollen allergens [9]. Zea m 1 belongs to a class of abundant grass pollen allergens that are encoded by several genes. These proteins can loosen the walls of grass cells, including those of the maize stigma and style [10]. In a study performed in Portugal on thirty-two children under 8 years of age, all gave a positive skin-prick test for grass pollen allergens [11], and 38% of the children were monosensitized to different grass pollen allergens. The decreasing order of sensitization frequency for these pollens is shown in Table 1. To prevent the type I hypersensitivity reaction, drugs known as mast cell stabilizers, e.g., synthetic, semi-synthetic and natural stabilizers, can be used as therapeutic agents. Mast cell stabilizers obtained from natural resources can be classified into different groups such as flavonoids, coumarins, phenols [12], terpenoids, alkaloids etc.
Considering their mechanisms of action, mast cell stabilizers can be classified as Ca2+ channel blocking agents, suppressors of gene expression (tumor necrosis factor alpha, different interleukins), inhibitors of phosphorylation reactions in the MAPK, ERK, JNK and Gab2 signaling pathways, down-regulators of the enzyme histidine decarboxylase, suppressors of CD23 mRNA, inhibitors of the COX-2/5-lipoxygenase enzymes, inhibitors of prostaglandin D2 synthesis, LTC4 inhibitors, and spleen tyrosine kinase inhibitors [13]. Explorations of alternative therapeutics for allergic reactions are described in my earlier work, where a specific sense siRNA is explored as an anti-allergic therapeutic during an immediate type of hypersensitivity reaction caused by the Zea m 1 pollen allergenic protein [14]. Presently, in allergen-specific immunotherapy (AIT), the disease-causing allergens are used for a disease-modifying treatment of allergy. Molecular allergen characterization is applied to produce allergy vaccines from recombinant allergens, peptides and allergen-encoding genes. The B-cell epitope technique is another promising method used to identify the antigenic determinants, or epitopes, present in antigenic proteins [15]. Epitope-based vaccine design is a state-of-the-art method because it is very specific, is able to avoid undesirable immune reactions, can create long-lasting immunity and, at the same time, is inexpensive. This method has been applied to various diseases such as tuberculosis [1] and Nipah virus infection [16]. In this study, an epitope-based peptide vaccine design method is studied for the Zea m 1 pollen allergen, using various T cell epitope prediction methods followed by molecular docking. The prediction methods and docking experiments are performed to design peptide-based vaccines for the allergic reaction caused by the Zea m 1 pollen allergen.

Materials and Methods
The steps and computational methods applied to predict T cell epitopes for preparing a peptide-based vaccine against the Zea m 1 pollen allergen are shown in Figure A1.

Retrieval of Zea m 1 Pollen Allergen Protein in FASTA Format
The amino acid sequence and three-dimensional structure of the Zea m 1 allergenic protein (PDB ID 2HCZ) are obtained from the UniProt knowledgebase [17].

MHC II Binding Epitope Prediction for Allergenic Protein
The MHC binding epitope prediction methods for the Zea m 1 allergen can be classified into three groups: (i) methods based on protein motifs, (ii) methods based on statistical or mathematical expressions and (iii) methods based on the structure of the allergen.

Motif Based Methods
SYFPEITHI, a database of MHC ligands and peptide motifs (Ver. 1.0) (http://www.syfpeithi.de/), comprises more than 7000 peptide sequences known to bind class I and class II MHC molecules. Using the FASTA sequence of the Zea m 1 pollen allergen, epitopes for MHC II binding are searched.

IEDB Recommended Method
The IEDB recommended method (www.iedb.org) is used to identify T cell epitopes. It applies the Consensus approach, which combines the NN-align, SMM-align, CombLib and Sturniolo algorithms. The NetMHCIIpan method is also used.

A Proteochemometrics Based Method
EpiTOP, a proteochemometrics-based model, theoretically predicts peptide binding to a whole group of MHC proteins (http://www.pharmfac.net/EpiTOP). This method helps to detect T cell epitopes on the basis of a mathematical expression.

Specificity-Determining Residue (SDR) Concept
PREDIVAC [18] is a method based on the specificity-determining residue (SDR) concept, and it covers 95% of MHC class II allelic variants. SDRs are a small set of structurally conserved positions in the peptide-binding interface that are responsible for specific recognition by MHC II molecules. Peptide binding prediction for the HLA class II protein DRB3*0101 is executed by parsing the query protein sequence into overlapping nonameric segments (peptides), each of which is assigned a Predivac binding score (0-100).
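The parsing step described above can be illustrated with a minimal Python sketch. It only shows how a sequence is cut into overlapping nonamers; it does not reproduce the Predivac scoring, and the function name `overlapping_nonamers` is a hypothetical helper introduced here. The input is the 15-mer VKVKYVADDGDIVLM reported in the SYFPEITHI results of this study, used as a short stand-in for the full protein sequence.

```python
def overlapping_nonamers(sequence, length=9):
    """Return (start_position, peptide) pairs for every window of `length` residues.

    Positions are 1-based, as is usual when reporting peptide locations.
    """
    sequence = sequence.strip().upper()
    return [
        (start + 1, sequence[start:start + length])
        for start in range(len(sequence) - length + 1)
    ]


# Short stand-in input: a 15-mer from the SYFPEITHI results, not the whole protein.
fragment = "VKVKYVADDGDIVLM"
for position, peptide in overlapping_nonamers(fragment):
    print(position, peptide)
```

A sequence of n residues yields n - 8 nonamers, so this 15-mer gives seven windows, one of which is the nonamer YVADDGDIV.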

A Method Based on the QM (Quantitative Matrices) Approach
ProPred is a method grounded in the QM (quantitative matrices) approach. It predicts binders for MHC class II molecules (http://www.imtech.res.in/raghava/propred/). This matrix-based method is also applied.

A Method Applying Position Specific Scoring Matrices (PSSMs)
RANKPEP (http://imed.med.ucm.es/Tools/rankpep.html) forecasts peptide binders to MHC II molecules from protein amino acid sequences or sequence alignments using Position Specific Scoring Matrices (PSSMs). Using this tool, MHC II binding epitopes of Zea m 1 are identified.
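As a rough illustration of how a position specific scoring matrix ranks candidate peptides, the sketch below sums one score per peptide position and sorts all windows of a sequence by that sum. It is a toy under stated assumptions: the three-column matrix `TOY_PSSM` and all of its values are invented for illustration, the helper names are hypothetical, and nothing here reproduces the matrices or the scoring scheme actually used by RANKPEP or ProPred.

```python
def score_peptide(peptide, pssm, default=0.0):
    """Sum the position-specific score of each residue in the peptide."""
    if len(peptide) != len(pssm):
        raise ValueError("peptide length must match the number of matrix columns")
    return sum(column.get(residue, default) for residue, column in zip(peptide, pssm))


def rank_binders(sequence, pssm):
    """Score every window of the matrix width and return (score, peptide), best first."""
    width = len(pssm)
    windows = [sequence[i:i + width] for i in range(len(sequence) - width + 1)]
    return sorted(((score_peptide(w, pssm), w) for w in windows), reverse=True)


# Hypothetical 3-column matrix: one dict per peptide position, values invented.
TOY_PSSM = [
    {"Y": 2.0, "W": 1.5},
    {"V": 1.0, "R": 0.5},
    {"A": 1.0, "M": 0.5},
]

for score, peptide in rank_binders("KYVADD", TOY_PSSM):
    print(f"{peptide}\t{score:.1f}")
```

A real class II matrix would have nine columns, one per position of the binding core, with scores derived from known binders.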

Structure Based Prediction Method
Structure based methods are based on the molecular docking technique. These methods compute binding energy between peptide and MHC molecule and the energetically favorable peptides are predicted as binders. A flowchart for molecular docking procedure is shown in Figure A2.

Population Coverage Prediction of Putative Epitopes
The putative epitopes and their cumulative predicted coverage are calculated specifically for the set of HLA class II allelic variants occurring in the target population of Asia, according to allele frequency data retrieved from the Allele Frequency Net Database (http://www.allelefrequencies.net/) [19].
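The idea behind a coverage estimate can be sketched for a single HLA locus. The example below assumes Hardy-Weinberg proportions, a common simplification that is not stated in this study, so it should be read as an illustration rather than as the calculation performed here. The allele names and frequencies are placeholders, not values from the Allele Frequency Net Database, and `locus_coverage` is a hypothetical helper.

```python
def locus_coverage(allele_frequencies, binding_alleles):
    """Fraction of individuals carrying at least one binding allele at one locus.

    Assumes Hardy-Weinberg proportions: a diploid individual lacks every
    binding allele with probability (1 - p)^2, where p is the summed
    frequency of the binding alleles.
    """
    p = sum(allele_frequencies[allele] for allele in set(binding_alleles))
    if not 0.0 <= p <= 1.0:
        raise ValueError("summed allele frequency must lie between 0 and 1")
    return 1.0 - (1.0 - p) ** 2


# Placeholder frequencies for one locus (hypothetical, for illustration only).
toy_frequencies = {"allele_A": 0.10, "allele_B": 0.10, "allele_C": 0.30}
coverage = locus_coverage(toy_frequencies, ["allele_A", "allele_B"])
print(f"{coverage:.2%}")
```

Because each individual carries two copies of the locus, coverage exceeds the summed allele frequency: a combined frequency of 0.20 corresponds to 36% of individuals under this assumption.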

Analysis for the Effectiveness of Peptide-Based Vaccine in Other Group 1 Grass Pollen Allergens
To assess the effectiveness of these two peptides as vaccines for the whole group of group 1 grass pollen allergens, a search is performed to identify homologous allergens in the SDAP allergens database [20]. SDAP (Structural Database of Allergenic Proteins) is a web server that provides quick access to the peptide sequences, three-dimensional structures and IgE epitopes of allergenic proteins. The database component of SDAP comprises information about the name, source, sequence, structure, IgE epitopes and literature references for allergens, together with links to the major protein databases, such as PDB, SWISS-PROT/TrEMBL, PIR-ALN and the NCBI Taxonomy Browser, as well as to literature sources, e.g., PubMed and MEDLINE.

Retrieved Sequence of Zea m 1 Pollen Allergen Protein in FASTA Format
The X-ray crystallographic structure of Zea m 1 (PDB ID 2HCZ) is shown in Figure 1 [17]. This allergenic protein structure is used to identify predicted T cell epitopes for peptide mapping to design a vaccine.

Epitope Search Results for Motif-Based Methods
Results from SYFPEITHI, a database of MHC ligands and peptide motifs (Version 1.0), for the prediction of CD4+ T cell epitopes are shown in Table 2. The peptides VKVKYVADDGDIVLM and LSWGAIWRMDTAKAL are identified as predicted T cell epitopes for this antigenic protein, and their prediction scores for various MHC II allelic proteins are listed in the same table.

Results from IEDB Recommendation Method
In the prediction method recommended by IEDB for MHC II binding of CD4+ T cell epitopes, the lower the percentile rank of an epitope, the better it is as a binder of the MHC II molecule. The percentile ranks predicted by the consensus and NetMHCIIpan methods for the predicted T cell epitopes are shown in Table 3.

The allelic protein HLA-DRB3*01:01 (Table 3) is therefore used in the Predivac method to predict nonamers as T cell epitopes, together with their predicted scores (Table 5).

Results for T Cell Epitope Prediction Using PROPRED Method
The peptides predicted by the PROPRED method, along with their positions in the pollen allergen protein and their predicted scores, are displayed in Table 6.

Results from the RANKPEP Method
The peptide sequences predicted by the RANKPEP method, along with their positions in the Zea m 1 pollen allergen protein and their predicted scores, are displayed in Table 7.

Structure Based T Cell Epitope Prediction by Using Molecular Docking Technique
From the above-mentioned methods, two peptide sequences, WRMDTAKAL and YVADDGDIV, are selected as suitable T cell epitopes for the Zea m 1 allergenic protein. Similarly, the MHC II allele HLA-DRB3*01:01 protein is identified as the most probable interacting MHC molecule. For docking studies, both T cell epitopes are submitted to the PEP-FOLD server [21,22] to predict their three-dimensional structures (Figure 2). The molecular interactions of each epitope with the specific HLA protein are identified in docking studies with the ClusPro 2.2 web server [23]. Molecular docking complexes of the two predicted peptide structures with the MHC II allele HLA-DRB3*01:01 protein are shown in Figure 3. The estimated accuracy of the docked structure for peptide YVADDGDIV is higher than that for WRMDTAKAL, as shown in Table 8.
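A simple consistency check between the prediction methods is to confirm that each selected nonamer lies inside one of the 15-mers reported by SYFPEITHI. The sketch below does only that string comparison, using the four peptide sequences given in this study. It is not the selection procedure itself, and `locate_cores` is a hypothetical helper name.

```python
# Peptide sequences as reported in this study.
FIFTEEN_MERS = ["VKVKYVADDGDIVLM", "LSWGAIWRMDTAKAL"]
NONAMERS = ["YVADDGDIV", "WRMDTAKAL"]


def locate_cores(long_peptides, cores):
    """Map each core to the (long peptide, 1-based start) pairs that contain it."""
    hits = {}
    for core in cores:
        hits[core] = [
            (peptide, peptide.find(core) + 1)
            for peptide in long_peptides
            if core in peptide
        ]
    return hits


for core, matches in locate_cores(FIFTEEN_MERS, NONAMERS).items():
    print(core, matches)
```

Each nonamer is found in exactly one 15-mer: YVADDGDIV starts at residue 5 of VKVKYVADDGDIVLM, and WRMDTAKAL starts at residue 7 of LSWGAIWRMDTAKAL.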

Population Coverage Prediction for Putative Epitopes
The putative epitopes and their cumulative predicted coverage are calculated specifically for the cluster of HLA class II allelic variants present in the Asian population. Population coverage is estimated from the allele frequency data retrieved from the Allele Frequency Net Database (Table 9). The prediction is carried out by considering 225 HLA class II proteins expressed in the target population.

Analysis for the Effectiveness of Peptide-Based Vaccine in Other Group 1 Grass Pollen Allergens
To assess the effectiveness of these two peptides as vaccines for the whole group of group 1 grass pollen allergens, a search is performed to identify homologous allergens in the SDAP allergens database [20]. The allergens found by FASTA alignment of the Zea m 1 sequence against all SDAP allergens, having an E score value higher than 0.010000, are listed in Table 10. Among these 50 allergens, 14 grass allergens are selected, omitting isoallergenic proteins. The multiple sequence alignment of the two T cell epitopes, YVADDGDIV and WRMDTAKAL, in the 14 grass allergens is shown in Figure 4. This study shows that the amino acid positions in these epitopes are almost conserved, so the two epitopes may be effective for all 14 grass allergens after clinical verification.
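How well an epitope is conserved in a homologous sequence can be quantified as the fraction of identical residues in the best matching window. The sketch below uses a simple ungapped sliding-window comparison, which is a simpler technique than the multiple sequence alignment used in this study. The two homolog fragments are invented test strings, not sequences from SDAP, and `best_window_identity` is a hypothetical helper.

```python
def best_window_identity(epitope, sequence):
    """Return (identity, window) for the ungapped window most similar to the epitope.

    Identity is the fraction of positions with the same residue. If the
    sequence is shorter than the epitope, (0.0, None) is returned.
    """
    width = len(epitope)
    best = (0.0, None)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        identity = sum(a == b for a, b in zip(epitope, window)) / width
        if identity > best[0]:
            best = (identity, window)
    return best


# Invented fragments for illustration only (not real allergen sequences).
toy_homologs = {
    "toy_identical": "AAYVADDGDIVAA",
    "toy_one_substitution": "AAYVADEGDIVAA",
}
for name, fragment in toy_homologs.items():
    identity, window = best_window_identity("YVADDGDIV", fragment)
    print(name, window, f"{identity:.2f}")
```

An identity of 1.0 corresponds to a completely conserved epitope, while a single substitution in a nonamer gives 8/9.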

Discussion
Allergens cause type I hypersensitivity reactions mediated by the immunoglobulin E (IgE) molecule. IgE biosynthesis, also known as sensitization, may be caused by airborne allergens, food allergens, drug allergens and occupational allergens. Modern clinical therapeutics for allergic reactions include a combination of patient awareness, allergen avoidance, pharmacotherapy and allergy immunotherapy. Allergy immunotherapy is a type of treatment that targets the basic immunological molecules and pathways involved in the allergic reaction and results in the activation of immunological tolerance by reducing IgE reactivity while retaining T cell reactivity [24]. A vast array of structurally altered allergens has been created, including allergenic oligopeptides, chemically modified allergoids, adjuvant-bound allergens and nanoparticle-based allergy vaccines. In allergen-specific immunotherapy (AIT), repeated doses of the sensitizing allergens are used for desensitization or hyposensitization of allergic patients. Several herbal and natural products are reported to regulate antigen-IgE mediated allergic responses [25,26]. Other immunotherapeutic strategies have also been reported, which use idiotype and anti-idiotype antibody interaction in vaccine production [27] and epitope-paratope peptide modulation [28,29]. Different bioinformatic algorithms, as well as computational methods [30], are used for identifying the biological functions of peptides.
Epitope-based vaccines are short oligopeptides derived from an antigen that, after presentation to T cells, are used to prevent diseases such as type I hypersensitivity. During antigen presentation, epitopes are bound to major histocompatibility complex (MHC) protein molecules. Peptide vaccines based on multiple T-cell epitopes can be administered for the rational use of immunogens among distinct ethnic populations, while providing numerous potential advantages over conventional vaccines. These advantages include more accurate regulation of immune response activation, a focus on the most appropriate antigenic regions of a group of proteins (those that are conserved and immunodominant), and benefits for production and biosafety. CD4+ T-cell epitopes play an important role in epitope-based vaccine design, because the help of these cells is indispensable for the production of strong humoral and cytotoxic CD8+ T-cell responses. However, the immune response to T-cell epitopes is restricted by HLA proteins. As a result, the HLA selectivity of T-cell epitopes becomes a major constraint on epitope-based vaccine design for genetically varied human populations. Two factors cause major problems in epitope-based vaccine design. The most common problem is that MHC class II alleles occur at different frequencies in different ethnicities, such as Asian and European populations. The second problem is that the MHC class II genes are the most polymorphic genes found in the human genome. Since the experimental testing of large sets of peptides against MHC molecules is very time-consuming as well as costly, in silico methods are sought after to overcome these problems. Criteria-based analysis predicts two epitopes on this pollen allergen: YVADDGDIV and WRMDTAKAL. The T cell epitopes identified in this study provide insight into a peptide-based vaccine for the type I hypersensitivity reaction induced by the Zea m 1 pollen allergen.

Conclusions
The crucial part of epitope-based vaccine design is its validation through in vivo and in vitro methods. Although two nonapeptides are identified by various motif-based, statistical and structure-based methods, experimental verification of the two epitopes YVADDGDIV and WRMDTAKAL is necessary to construct a vaccine against the Zea m 1 pollen allergen. This research work not only provides a novel pathway to design a peptide-based vaccine for the Zea m 1 pollen allergen, but also assesses in silico the effectiveness of these two T cell epitopes for 14 grass pollen allergens. Cross-reactivity occurs very frequently among these pollen allergen molecules due to very high similarity in their antigenic protein sequences. The almost conserved epitope sequences in these homologous proteins indicate the effectiveness of the predicted epitopes as probable vaccines for group 1 grass allergens. Population coverage calculation shows their efficiency in Asian populations. From a diagnostic point of view, these two T cell epitopes have immense importance for detecting the sensitivity of an individual towards the Zea m 1 pollen allergen. By preparing a monoclonal antibody with these two epitopes, diagnosis of an individual susceptible to a hypersensitivity reaction on contact with a grass pollen allergen is possible. Allergen immunotherapy with the YVADDGDIV and WRMDTAKAL epitopes is expected to reduce immunogenic reactions in Zea m 1 sensitive populations.

Figure A2. Flowchart for molecular docking procedure.