In silico analysis of Shiga toxins (Stxs) to identify new potential vaccine targets for Shiga toxin-producing Escherichia coli

Shiga toxins belong to a family of structurally and functionally related toxins serving as the main virulence factors for pathogenicity of the Shiga toxin-producing Escherichia coli (STEC) associating with Hemolytic uremic syndrome (HUS). At present, there is no effective treatment or prevention for HUS. The aim of the present study was to find conserved regions within the amino acid sequences of Stx1, Stx2 (Shiga toxin) and their variants. In this regard, In-silico identification of conformational continuous B cell and T-cell epitopes was performed in order to introduce new potential vaccine candidates. 93–100% Homology was observed in Stx1 and its variants. In Stx2 and its variants, 69–100% homology was shown. By sequence alignment with Stx1 and Stx2, 54% homology was detected. T-cell epitope identification in Stx1A and Stx2A epitopes with highest binding affinity for each HLA (human leukocyte antigen) was demonstrated with 100% identity among all Stxs. B-cell epitope prediction was resulted in finding of four common epitopes between Stxs. In silico analysis of Stxs was resulted to identification of new peptide targets that could be used in development of new epitope vaccine candidates or in immunodiagnostic tests.

Abstract Shiga toxins belong to a family of structurally and functionally related toxins serving as the main virulence factors for pathogenicity of the Shiga toxin-producing Escherichia coli (STEC) associating with Hemolytic uremic syndrome (HUS). At present, there is no effective treatment or prevention for HUS. The aim of the present study was to find conserved regions within the amino acid sequences of Stx1, Stx2 (Shiga toxin) and their variants. In this regard, In-silico identification of conformational continuous B cell and T-cell epitopes was performed in order to introduce new potential vaccine candidates. 93-100% Homology was observed in Stx1 and its variants. In Stx2 and its variants, 69-100% homology was shown. By sequence alignment with Stx1 and Stx2, 54% homology was detected. T-cell epitope identification in Stx1A and Stx2A epitopes with highest binding affinity for each HLA (human leukocyte antigen) was demonstrated with 100% identity among all Stxs. B-cell epitope prediction was resulted in finding of four common epitopes between Stxs. In silico analysis of Stxs was resulted to identification of new peptide targets that could be used in development of new epitope vaccine candidates or in immunodiagnostic tests.

Background
Enterohaemorrhagic Escherichia coli (EHEC) strains are food-borne zoonotic pathogens that can cause of sporadic outbreaks, globally (Garcia-Angulo et al. 2013). EHEC can cause severe gastroenteritis, hemorrhagic colitis and in some cases, Hemolytic uremic syndrome (HUS) due to the production of Shiga toxins (Stxs) (Basu and Tumer 2015). HUS is a life-threatening infection, with a high mortality rate among children and old people. O157:H7 EHEC is the most common serotype associated with outbreaks, worldwide. Ruminants, especially cattle are considered as a reservoir for EHEC O157:H7 and fecal shedding leads to food contamination. Studies in cattle from high prevalence countries demonstrated carriage ranging from \1% to more than 30% (Garcia-Angulo et al. 2013).
Stxs are a group of genetically and structurally related exotoxins expressed by Shigella dysenteriae serotype 1 and some of the Shiga toxin-producing Escherichia coli (STEC) serotypes. Stxs are also known as VT (Verotoxin) due to the sensitivity of Vero cells to them (Pal 2015). STEC strains produce Stxs categorized as Stx1, Stx2 and also variants of either toxin. Stx1 variants are Stx1, Stx1c, Stx1d and Stx2 variants are Stx2, Stx2c, Stx2d, Stx2e, Stx2f and Stx2g. E. coli strains expressing Stx2 are potentially more virulent and they are more frequently associated with HUS (Pal 2015). The Stx genes in E. coli strains are coded with active or cryptic lambdoid bacteriophages and they are highly expressed upon activation of the lytic cycle of the phage (Mauro and Koudelka 2011;Johannes and Römer 2010). Stx family has an AB5 molecular structure, composed of five identical B subunits (7.7 kDa each) or binding domains and one enzymatically active A subunit (32 kDa). The B subunits are responsible for the attachment of holotoxin (AB5) to glycolipid receptor globotriaosylceramide (Gb3, or CD77) on the surface of host cells (Pal 2015;Johannes and Römer 2010). The A subunits are ribosome inactivating proteins (RIPs) that catalyze the irreversible damage in ribosome by modifying the large rRNA and finally inhibiting protein synthesis of host cells (Basu and Tumer 2015).
At present, there is no effective treatment or prevention for HUS. Administration of certain antibiotics has been shown to enhance the risk of HUS development. On the other hand, passively administered toxin-specific antibodies could effectively prevent toxin-mediated diseases. Therefore, several Stx2-specific monoclonal antibodies have been developed and many of them neutralized the activity of Stx2 in vitro or in vivo. In fact, Stx2 is much stronger than Stx1 in HUS infection. In addition, the essential virulence factor of EHEC, Stxs has been used for vaccination by induction of immunity responses to prevent intoxication. Primary studies demonstrated that vaccination with inactive Stx-derivatives could effectively induce the production of neutralizing antibodies and in some cases, induce protection against toxemia in animal models (Garcia-Angulo et al. 2013). Moreover, a number of studies aimed to evaluate the immunogenicity of the purified B subunit or its derivative, 2S protein. The results of the studies have been showed that the vaccines induced antibodies are able to neutralize the cytotoxic activity of both Stxs (Garcia-Angulo et al. 2013).
Therefore, finding protective B and T cell epitopes can be an effective approach for the development of a new antimicrobial vaccine by using bioinformatics tools. Additionally, identification of B-cell epitopes can lead to the development of therapeutics neutralizing antibody against microbes or their toxins. In fact, prediction of Stxs B-cell and T-cell epitopes with high affinity for antibodies MHC-II (Major histocompatibility complex) may lead to the introduction of a new vaccine candidate against STEC infections.
The purpose of the current study was to find conserved epitope regions within the Stx1, and Stx2 amino acid sequences in A and B subunits of their variants. Moreover, in silico identification of conformational continuous B-cell and T-cell epitopes have been performed in order to introduce a new potential vaccine candidate.

Stx1 and stx2 sequence analysis
The E. coli and Shigella full-length sequences of Stx1 and Stx2 subunits were obtained from GenBank and aligned using multiple sequence alignment software (http://work bench.sdsc.edu/) (Higgins et al. 1992). The identical fragments within protein sequences of A and B subunits were considered as the conserved areas.

Stx1 and stx2 structures analysis
The location of signal peptides within the stx1 and stx2 subunits was determined using SignalP server (Petersen et al. 2011). Surface accessibility, hydrophilicity and antigenicity of the Stx1 and Stx2 subunits were also determined using Immune Epitope Data Base (IEDB) analysis resource (http://www.iedb.org).

Tertiary structure prediction
In fact, the 3D structure of the Stx1 and Stx2f are not available from protein data bank, Stx1 and Stx2f subunits    protein modeling was performed using I-TASSER server (http://zhanglab.ccmb.med.umich.edu/I-TASSER) (Zhang 2008).

Validation and analysis of the 3D models
Energy minimization for the 3D models was performed using Swiss-PDB Viewer 4.1 software. Analysis of the 3D models was made using protein structure analysis (ProSa) server (https://prosa.services.came.sbg.ac.at/prosa.php) (Wiederstein and Sippl 2007) and Ramachandran Plot Analysis resource (RAMPAGE) (Lovell et al. 2002). The Z-score (overall model quality) and energy plots were created by ProSa server.
Alignment of the 3D structures was performed using Swiss-PDB Viewer and TM-align server (Zhang and Skolnick 2005) to calculate RMS (Root-mean-square deviation) and TM (Transactional memory) scores between models, respectively. These scores were represented the conformational differences between two proteins of interest. TM-score of Stx2 and Stx2f structures was calculated using TM-align server.

Prediction of antigenic T-cell epitope
The location of human MHC II epitopes within the Stx1 and Stx2 subunits sequences were predicted by IEDB analysis resource (Wang et al. 2010).

Prediction of allergenicity of the proteins
Prediction of allergenicity of the Stx1 and Stx2 protein sequences has been done using AlgPred tool (Saha and Raghava 2006).

Stxs protein sequences analysis
The sequences of Stx1, Stx2 and their variants were aligned using clustal W tool. The results have been summarized in the (Table 1). In A and B subunits of Stx1, 100% homology was shown between E. coli and Shigella. However, in A and B subunits of Stx2 between E. coli and Shigella 56% and 51% homology was detected, respectively. Additionally, the lowest percentage of sequence similarity with Stx2 was demonstrated in the Stx2f variant. The conformational homology with TMscore between Stx2 and Stx2f was found to predict the tertiary structure of this variant.

Analysis of Stxs protein structures
In order to predict the 3D model of the toxins, the location of the signal peptides were determined using SignalP server. The cleavage sites of A and B subunits were between (amino acids) aa22-aa23 and aa20-aa21, respectively. The results predicting the surface accessibility, hydrophilicity and antigenicity of the stx1 and stx2 subunits were also summarized in the (Table 2).

Secondary structure prediction
Prediction of Stxs secondary structures has performed using GOR4 server (Fig. 1). The results are summarized in (Table 3).

Tertiary structures prediction and analysis
Tertiary structure of Stx1 and Stx2f were predicted using I-TASSER server. Analysis of the models has been done using ProSa server after energy minimization of the 3D models by SPDBV software (Figs. 2,3,4,5). Evaluation of residues in Ramachandran plot was presented in Table 4. The Z-score of the tertiary models were within the range of scores typically found for native proteins of the similar size. Moreover, the C-score of the models demonstrated that the models are predicted with a high confidence (Table 4). Calculating the TM-score of Stx2 and Stx2f structures using TM-align server was indicated that they are in the same folding (TM-score 0.99) because TM scores between 0.5 and 1 is representative of same protein fold.

Prediction of B-cell epitopes
Prediction of continuous epitopes was performed using BCpred tool (Table 5). Conformational B-cell epitopes were predicted using Elipro tool ( Table 6). The strongest B-cell epitopes were selected according to the criteria based on cutoff values for BCpred and Elipro which were[0.8 and[0.5, respectively.

Prediction of T-cell epitopes
Prediction of T-cell epitopes was done for identification of epitopes having strong affinity for human MHC-II alleles.
The list of epitopes with highest binding affinity for MHC molecules is summarized in  c Knowledge-based energy plot of the best model (low energy = stability). d Ramachandran plot of the best model The best predicted Stx2 epitope was 100% conserved in Stx2 sequences but not in stx1 sequences. However, the HLADQA1* 05:01/DQB1*02:01 epitope was 100% identical in Stx1 and stx2 sequences. Analysis of its physicochemical properties showed that this 1693 Da peptide has 4.4 h half life in mammalian cells and it is also stable ( Table 8).

Docking of predicted T-cell epitopes to MHC molecules
Molecular docking was performed using Hex program to analyze the interaction of the best T-cell epitopes with MHC II molecules. Best binding epitopes were docked with HLADRB3*01:01, HLADRB5*01:01, HLADPA1*01:03/  Binding energy resulting from docking simulation has summarized in Table 7.

Prediction of allergenicity of Stxs
AlgPred tool was used to predict the allergenicity of the Stxs sequence (Threshold = 0.4). The results demonstrated c Knowledge-based energy plot of the best model (low energy = stability). d Ramachandran plot of the best model that A subunit of Stx1 and Stx2 (with scores of -1.6 and -1.2, respectively) were not allergens. However, B subunit of Stx1 and Stx2 were predicted to have allergens potential with scores of 0.5 and 0.8, respectively.

Discussion
Even though many strategies have been introduced to prevent HUS, effective therapeutics against Stxs infections are still in great demand. Antibiotics therapy is relatively dangerous since it can trigger subsequent bacterial increase and release more stxs that eventually raise the risk of severe clinical complications. Many efforts performed to find protective antigens and develop new vaccine candidates against HUS (Mukherjee et al. 2002;Cheng et al. 2009). Stxs can directly contribute to the EHEC pathogenesis because they have capability for eliciting protection against EHEC infections by inducing neutralizing antibodies. Previous study by Marcato et al. revealed that recombinant Stx2B subunit conjugated to keyhole limpet hemocyanin was able to elicit a protective immunity against Stx2 holotoxin in mice. The results of the study carried out by Ran et al. (2008) indicated that fusion protein consist of Stx2B subunit linked to the B subunit of E. coli heat-labile enterotoxin (LT-B) could elicit strong immunogenicity (Cheng et al. 2009).
In the present study, we aimed in silico analysis of the Stxs protein sequences and structures to find new potential vaccine candidates. With regard to the fact that conserved regions of the special antigen are good targets for developing vaccines, Firstly, amino acid sequences of Stx1 and Stx2 variants were analyzed by using alignment tools to find conserved areas among each Stx variants and also between Stx1 and Stx2. Our results demonstrated that Stx1, Stx1c and Stx1d have 93-100% homology. Except for Stx2 and Stx2f (showing 69% identity), Stx2 and other variants have 93-100% homology. Sequence alignment of Stx1 and Stx2 was shown that they have 54% homology. Since, the tertiary structure of Stx1 and Stx2f is not available, the 3D structures of these toxins were predicted using I-TASSER server. Models analysis was done using Prosa and SPDBV to show that both models are in the range of natural proteins of the same size. Calculating the TM-score of Stx2 and Stx2f 3D structures and also analysis of the secondary structure of both proteins demonstrated that both proteins are in almost the same fold and they are structurally identical.
Prediction of potential antigen allergenicity in the therapeutic proteins is realized as an important factor in vaccine therapy and a significant parameter by WHO (World health organization) (Saha and Raghava 2006). Prediction of allergenicity of Stxs by AlgPred was indicated that Stxs B subunits are potentially allergenic. Therefore, their application as vaccine candidate could be limited to epitope vaccines. On the other hand, the improved knowledge of antigen recognition at molecular level has contributed to the development of designed epitope vaccines. The major purpose of epitope vaccine design is to synthesize the identified B-cell and T-cell epitopes that are immunodominant and can elicit specific immune responses. Therefore, B-cell epitopes in conjugation with T-cell epitopes can increase the immunogenicity of the target molecule. Peptide vaccines have advantages for their chemical stability, absence of infectious potential, and ease of construction and production (Sette and Fikes 2003;Correia et al. 2014).
In order to find potential vaccine candidates, prediction of B-cell and T-cell epitopes was performed using bioinformatics tools. Presently, there are many techniques available for experimental identification of B-cell epitopes. Computational techniques offer a fast scalable cost-effective approach for: (1) predicting B-cell epitopes, (2) focusing experimental investigations, and (3) improving our knowledge of antigen-antibody interactions (EL-Manzalawy and Honavar 2010). Although a large majority of B-cell epitopes are discontinuous, epitope prediction methods have primarily focused on identifying linear B-cell epitopes that have crucial role in protection (EL-Manzalawy et al. 2008b). Linear B-cell epitope has vast application in the area of antibody production, immunodiagnostics, epitope-based vaccine design and selective immunization of therapeutic proteins (Singh et al. 2013).
In conclusion, a new peptide targets was identified by in silico analysis that could be used in development of new epitope vaccine candidates or in immunodiagnostic tests.

Compliance with ethical standards
Conflict of interest The authors do not have any conflict of interest.