Introduction

Hepatitis C virus (HCV) infection is a serious global health problem and causes chronic hepatitis, liver cirrhosis, and hepatocellular carcinoma (HCC) (Akuta et al. 2007; Ajorloo et al. 2015; Alborzi et al. 2015, 2017; Moayedi et al. 2018; Hashempoor et al. 2018). An estimated 160 million people are infected worldwide (Lavanchy 2011). The prevalence of HCV infection ranges from 0.2% to 40% across countries; in Iran it is 0.16% (Sefidi et al. 2013).

The lack of an effective vaccine and of therapeutic choices has led to the rapid growth of HCV infection (Lauer and Walker 2001).

HCV has six major genotypes (1–6) and 70 distinct subtypes (a, b, c, etc.) globally (Martro et al. 2011). Genotyping analyses show a sequence difference of 30–33% between genotypes and around 20–25% between subtypes (Sefidi et al. 2013). According to current investigations in Iran, the predominant HCV subtype is 1a, followed by 3a and 1b (Sefidi et al. 2013).

HCV is a positive-strand RNA virus encoding three structural components: the core protein and the two envelope glycoproteins E1 and E2 (Ajorloo et al. 2015).

The core protein has many confirmed roles: it binds RNA and DNA and has an important function in RNA packaging. HCV-core has been shown to be a nucleic acid chaperone that, like retroviral nucleocapsid (NC) proteins, acts to rearrange the HCV 3′UTR, resulting in RNA dimerization in vitro (Caval et al. 2011; Cristofari Gl; Ivanyi-Nagy et al. 2004; Steinmann et al. 2008).

HCV-core is a highly basic protein that forms the viral NC and interacts with cellular proteins and signal transduction pathways. Through these interactions with the host cell, core may have a function in persistent infection and the pathogenesis of HCV liver disease (Polyak et al. 2006; Hashempour et al. 2015).

The core protein consists of three predicted domains: domain 1 (D1, aa 1–117), domain 2 (D2, aa 117–177), and domain 3 (D3, aa 177–191) (Strosberg et al. 2010).

Domain 1 is rich in positively charged amino acids; it is involved in RNA binding, promotes dimerization of the viral RNA, and has a significant role in NC formation and core envelopment by endosomal membranes (Ivanyi-Nagy et al. 2006).

Several mutations identified in domain 1 are involved in hepatocarcinogenesis and the development of HCC, the efficacy of triple therapy, and the interaction between core and CXCL6 (Akuta et al. 2007, 2010, 2011; Fishman et al. 2009; Ogata et al. 2002; Takahashi et al. 2001; Idrees and Ashfaq 2013).

Humoral and cellular immune responses against HCV infection are inefficient, and there is no convincing explanation of HCV immunopathogenesis (Gremion and Cerny 2005). However, HCV-specific IgM and IgG are both detected in acute infection. There is notable evidence supporting a role for antibodies (Abs) in the control of HCV infection, especially in reinfection (Cashman et al. 2014).

Some researchers have described a rise in the anti-HCV humoral immune response after immunization with core protein (Aghasadeghi et al. 2006). Others have claimed that in acute HCV infection the antibody titer is very low, and that the delay in neutralizing antibody production is responsible for the failure to prevent HCV infection (Netski et al. 2005). Still others have proposed the large diversity of epitopes in HCV proteins as a possible way to escape the humoral response (Pavio and Lai 2003).

Cellular immune responses have an important role in the clearance of HCV infection. Patients with cellular immune dysfunction, such as those infected with human immunodeficiency virus (HIV), show rapid HCV progression. Some researchers have claimed that the cellular immune system also promotes liver injury through cytolysis of infected cells. In spite of cellular immune responses, HCV often evades recognition and is able to persist (Ward et al. 2002).

Several studies have described a number of pathways by which HCV escapes cellular responses, including impaired, oligo-/mono-specific, or absent virus-specific CD4+ and CD8+ T-cell responses; mutation of epitopes; and weak proliferative capacity, cytotoxicity, and TNF-α and IFN-γ secretion by CD8+ T cells (Neumann-Haefelin et al. 2005). In addition, regulatory T cells (Tregs) induced by HCV infection have a significant role in the impaired activity of the cellular immune response (Hashempour et al. 2015; Hashempoor et al. 2010).

Bioinformatics tools are efficient means to study viruses and different parts of the HCV genome like core domains (Idrees and Ashfaq 2013; Moattari et al. 2015; Dehghani et al. 2017; Nezafat et al. 2018; Atapour et al. 2018; Sarvari et al. 2014; Behzad Dehghani and Zahra Hasanshahi 2019). Many programs have been developed to analyze the function, structure, and modifications of core protein, providing a large amount of information about core domains and important mutation sites (Akuta et al. 2007, 2010, 2011).

Such data are useful for predicting HCV disease development and treatment response. In this study, we employed several bioinformatics tools to find important mutations in domain 1 of the core protein, the general properties of its B-cell and T-cell epitopes, its modification sites, and the structure of domain 1 in Iranian HCV-infected samples from 2006 to 2017.

Materials and Methods

Sequence Alignment and Phylogenetic Tree

Domain 1 sequences of 188 Iranian HCV samples and reference sequences of HCV genotypes registered in NCBI GenBank (http://www.ncbi.nlm.nih.gov/) from 2006 to 2017 were downloaded. Homology among sequences was determined by multiple sequence alignment in CLC Sequence Viewer under the following parameters: gap open cost, 10; gap extension cost, 1.0; and the very accurate progressive alignment algorithm. Phylogenetic trees were constructed with CLUSTAL X, version 1.81, by the neighbor-joining method, with bootstrap resampling (100 bootstrap samples) to confirm the reliability of the trees. The accession numbers of all sequences are displayed in Table 1.
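Neighbor-joining works on a matrix of pairwise distances between aligned sequences. As a minimal sketch of that input, assuming the simple uncorrected p-distance (the proportion of differing sites) rather than whatever distance correction CLUSTAL X applied, and using short made-up toy sequences instead of the study data, the calculation might look like this. The names `p_distance`, `distance_matrix`, and `TOY_ALIGNMENT` are illustrative only.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def distance_matrix(aligned):
    """Pairwise p-distances for a {name: aligned_sequence} mapping."""
    return {
        (x, y): p_distance(aligned[x], aligned[y])
        for x, y in combinations(sorted(aligned), 2)
    }


# Hypothetical toy alignment (NOT sequences from this study).
TOY_ALIGNMENT = {
    "seq1": "MSTNPKPQRK",
    "seq2": "MSTNPKPQRQ",
    "seq3": "MSTIP-PQRK",
}

for pair, dist in distance_matrix(TOY_ALIGNMENT).items():
    print(pair, round(dist, 3))
```

A matrix of this kind is what a neighbor-joining implementation would consume; bootstrap support is then obtained by resampling alignment columns and rebuilding the tree.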

Table 1 The accession numbers of all 188 sequences that were used in this study

Determination of Mutations

Based on previous studies, we selected several significant mutations involved in (1) hepatocellular carcinoma, (2) viral response to triple therapy, and (3) the interaction between core and CXCL6. All sequences were then compared to find these mutations (Akuta et al. 2007; Fishman et al. 2009; Ogata et al. 2002).
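The comparison step amounts to tallying which residues occur at each position of interest across the aligned sequences. A minimal sketch, assuming the sequences are already aligned and numbered from residue 1; the helper names and the toy sequences are hypothetical and are not taken from the study:

```python
from collections import Counter


def residues_at(sequences, position):
    """Tally the residues seen at a 1-based position across aligned sequences."""
    return Counter(
        seq[position - 1] for seq in sequences if len(seq) >= position
    )


def variants_at(sequences, position, expected):
    """Return only the residues that differ from the expected residue."""
    return {
        residue: count
        for residue, count in residues_at(sequences, position).items()
        if residue != expected
    }


# Hypothetical toy alignment (NOT study data): position 3 varies.
toy_sequences = ["MSRNP", "MSQNP", "MSRNP", "MSHNP"]

print(residues_at(toy_sequences, 3))
print(variants_at(toy_sequences, 3, expected="R"))
```

Running the same tally for each position reported in the literature (for example 70 and 91) gives per-position counts of the kind summarized in a frequency table.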

Physico-chemical Analysis

General properties of domain 1 (genotype 1a) were determined with ExPASy ProtParam (http://expasy.org/tools/protparam.html) and ProtScale (http://web.expasy.org/protscale/) (Gasteiger et al. 2005).
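Two of the indices ProtParam reports follow simple published formulas: GRAVY is the mean Kyte-Doolittle hydropathy value over all residues, and the aliphatic index is X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), with X in mole percent. The sketch below illustrates both; it is not ProtParam's own code, and the toy peptide is hypothetical rather than the domain 1 sequence.

```python
# Kyte-Doolittle hydropathy values, the scale on which GRAVY is defined.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathicity: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu)), X in mole percent."""
    def mole_percent(residue):
        return 100.0 * seq.count(residue) / len(seq)

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


# Hypothetical lysine/arginine-rich toy peptide (NOT the domain 1 sequence);
# its GRAVY is negative, i.e. hydrophilic on this scale.
toy_peptide = "MKRKTSRRPQGA"
print(round(gravy(toy_peptide), 3), round(aliphatic_index(toy_peptide), 1))
```

A negative GRAVY value indicates a hydrophilic peptide, and a higher aliphatic index is read as greater thermostability.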

B-Cell Epitopes Prediction

The Chou and Fasman, Karplus and Schulz, Kolaskar and Tongaonkar, Emini, Parker, and BepiPred methods at http://www.immuneepitope.org (http://tools.immuneepitope.org/tools/bcell/iedb_input) were run to predict B-cell epitope positions (Chou and Fasman 2009; Karplus and Schulz 1985; Emini et al. 1985; Parker et al. 1986; Larsen et al. 2006).

B-cell epitopes were also predicted with BcePred (http://www.imtech.res.in/raghava/bcepred) on the basis of hydrophilicity, flexibility/mobility, accessibility, polarity, exposed surface, and turns (Saha and Raghava 2004).

ABCpred (http://www.imtech.res.in/raghava/abcpred/) was used to predict 16-mer B-cell epitopes (Saha and Raghava 2006a, b).

Prediction of T-Cell, CTL Epitopes and Allergic Properties

ProPred-I (http://www.imtech.res.in/raghava/propred1/) (Singh and Raghava 2003) was employed for MHC class I binding peptide prediction, and ProPred (http://www.imtech.res.in/raghava/propred/) was used for MHC class II binding peptide prediction. The programs were run at the default 4% threshold, with the proteasome and immunoproteasome filters turned on at a 5% threshold (Singh and Raghava 2001).

MHC class I and II predictions were determined using the Immune Epitope Database (IEDB) (http://tools.immuneepitope.org/main/). For prediction of CTL epitopes, “ctlpred” and ANN methods were used (48).

The probability of antigenicity was predicted with VaxiJen at http://www.ddg-pharmfac.net (Doytchinova and Flower 2007).

IgE epitopes and allergic properties were estimated at http://www.imtech.res.in/raghava/algpred/index.html by using AlgPred (Saha and Raghava 2006).

Post-translational Modification

Serine, threonine, and tyrosine phosphorylation sites were predicted using DISPHOS (http://www.dabi.temple.edu/disphos/pred.html) (Iakoucheva et al. 2004) and NetPhos (http://www.cbs.dtu.dk/services/NetPhos/) (Blom et al. 1999). Kinase-specific phosphorylation sites were determined with NetPhosK (http://www.cbs.dtu.dk/services/NetPhosK/) (Blom et al. 2004). NetNGlyc (http://www.cbs.dtu.dk/services/NetNGlyc/) (Gupta and Brunak 2002) and GlycoEP (http://www.imtech.res.in/raghava/glycoep/submit.html) (Chauhan et al. 2013) were employed to predict N-glycosylation sites.

Secondary and Tertiary Structure Prediction

To predict the secondary and tertiary structures of core and domain 1 of genotype 1a, SOPMA (http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_sopma.html) (Geourjon and Deleage 1995), I-TASSER (http://zhanglab.ccmb.med.umich.edu/I-TASSER) (Roy et al. 2010), the PHYRE2 server (http://www.sbg.bio.ic.ac.uk/~phyre2/html) (Kelley and Sternberg 2009), and the (PS)2-v2 server (http://ps2v2.life.nctu.edu.tw) (Chen et al. 2006) were employed. To evaluate the stereochemistry and quality of the 3D structures, QMEAN (http://swissmodel.expasy.org/qmean/cgi/index.cgi) (Benkert et al. 2008) was used, and the Ramachandran plot was mapped with RAMPAGE (http://mordred.bioc.cam.ac.uk/~rapper/rampage.php).

The Signal Peptide Prediction

The signal peptide was predicted with “Signal-BLAST” and “SignalP 4.1 Server”.

Prediction of Epitope Digestion

PeptideCutter was used to determine potential cleavage sites for proteases.

Research Ethics

All data were collected anonymously in accordance with legal requirements regarding data protection and medical confidentiality. Approval from the Faculty Human Research Ethics Committee (Shiraz University of Medical Sciences) was obtained before the commencement of the study.

Results

Mutations and Phylogenetic Tree

Among all sequences submitted to NCBI GenBank, we could not find any from 2017.

The phylogenetic tree for all sequences is shown in Fig. 1. All 2006 sequences were placed in one cluster at the bottom of the tree, and one 2012 sequence was highly similar to KF218585.1 (2014). The majority of sequences were closer to 1a and 3a than to the other reference sequences.

Fig. 1 Phylogenetic tree based on domain 1 sequences, constructed by the neighbor-joining (NJ) method. The numbers at the forks show how many times the group to the right occurred among 100 bootstrap samples. Reference sequences are labeled with their subtype after the accession number (1a, 1b, etc.). Sequences were categorized in five major clusters

All important mutation positions are listed in Table 2; the majority of mutations occurred in the 2013 and 2016 samples. No mutation was detected at positions 12, 23, 25, 39, and 45.

Table 2 Frequency of all identified mutations in HCV-core domain1 sequences

ProtParam Analysis

ProtParam results for domain 1 are listed in Table 3. Because of its high percentage of basic amino acids, domain 1 is a highly basic peptide (theoretical pI: 12). The instability index, an estimate of the stability of a protein in a test tube, showed that domain 1 is an unstable peptide. The aliphatic index, regarded as a positive factor for increased thermostability of proteins, indicated that this peptide is thermostable. GRAVY is a hydropathicity index in which an increasingly positive score indicates greater hydrophobicity; by this index, domain 1 is a hydrophilic peptide.

Table 3 Domain1 physicochemical properties computed by “Protparam”

ProtScale Analysis

Hydropathicity analysis by the Kyte and Doolittle method showed that the major part of the peptide had a negative score; the maximum hydropathicity score was at aa 34 (valine) and the minimum was at aa 14 (asparagine).

Amino acid flexibility predicted by the Bhaskaran and Ponnuswamy method indicated that the maximum flexibility was around aa 58 (proline) and the minimum was around aa 95 (glycine).

Transmembrane (TM) tendency calculated by the Zhao and London method showed that the major part of the peptide had a negative score; the maximum transmembrane tendency was at aa 34 (valine) and the minimum was at aa 10 (lysine).

Peptide polarity predicted by the Grantham method showed that the maximum polarity was at aa 12 and 13 (lysine and arginine) and the minimum polarity was at aa 34 (valine).
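Profiles of this kind are produced by sliding a window along the sequence and assigning the mean scale value of the window to its central residue. The sketch below shows the idea with the Kyte-Doolittle scale. The window size (9), the equal weighting of all window positions, and the toy sequence are assumptions made for illustration; the source does not state the ProtScale settings used.

```python
# Kyte-Doolittle hydropathy scale (any per-residue scale could be swapped in).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sliding_profile(seq, scale, window=9):
    """Return (1-based center position, mean score) for each full window."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    profile = []
    for center in range(half, len(seq) - half):
        segment = seq[center - half:center + half + 1]
        score = sum(scale[aa] for aa in segment) / window
        profile.append((center + 1, score))
    return profile


# Hypothetical toy sequence (NOT the domain 1 sequence).
toy_sequence = "MKRKTSAVVLGGRRPQ"
profile = sliding_profile(toy_sequence, KYTE_DOOLITTLE, window=9)
peak_position, peak_score = max(profile, key=lambda item: item[1])
print(peak_position, round(peak_score, 3))
```

Residues closer than half a window to either end get no score, which is why maxima and minima are only reported for interior positions.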

B-Cell Epitopes Prediction

http://www.immuneepitope.org online software:

Chou and Fasman Beta-Turn Prediction, which uses predicted turns to predict antibody epitopes, showed one high-score region, 99–112.

Emini Surface Accessibility Prediction, which is based on surface accessibility scale, showed two high score positions (4–20 and 49–61).

The Karplus and Schulz flexibility scale was used for B-cell epitope prediction; this method is based on the mobility of protein segments, derived from the known temperature B factors of the α-carbons of 31 proteins of known structure. Results demonstrated two positions with the highest score (50–60 and 5–12).

Kolaskar & Tongaonkar Antigenicity method is based on physicochemical properties of amino acid residues and their frequencies of occurrence in experimentally known segmental epitopes to predict antigenic determinants on protein. Results showed five positions of 20–39, 43–49, 63–69, 78–85, and 93–101.

Parker Hydrophilicity Prediction method is based on peptide retention times during high-performance liquid chromatography (HPLC) on a reversed-phase column. By this method, two regions were found: 9–16, 51–57.

Linear B-cell epitopes were determined using BepiPred. This method is based on a combination of a hidden Markov model and a propensity scale method. Three regions (1–25, 51–84, and 102–116) were found by BepiPred analysis.

BcePred Results

BcePred predicts linear B-cell epitopes from a combination of physicochemical properties (hydrophilicity, flexibility/mobility, accessibility, polarity, exposed surface, and turns), based on a non-redundant dataset. Using the BcePred online software, three regions (4–22, 47–59, and 109–115) with the highest combined score were found.

Five conserved 16-mer regions (78, 49, 1, 64, and 102) were found by the ABCpred prediction server (Table 4).

Table 4 Conserved 16-mer B-cell epitope regions in HCV-core domain 1, predicted by ABCpred online software

VaxiJen Prediction

According to the predefined cutoff of the VaxiJen program (model: virus; threshold: 0.4), domain 1 was predicted to be a probable antigen.

IgE Epitopes

The prediction of allergenic proteins by mapping of IgE epitopes, SVM, and hybrid methods showed that domain 1 was not an allergenic protein.

T Cell-Epitopes

Regarding T-cell responses against HCV, previous studies found associations between some host human leukocyte antigen (HLA) alleles and HCV infection in Iranian patients. We found several epitopes for these HLA alleles, shown in Table 5.

Table 5 Predicted epitopes in the HCV-core domain 1 sequence (genotype 1a) for HLA alleles identified by previous studies in Iranian patients

Some studies found both CD4 helper and CD8 CTL responses against HCV infection. ctlpred found several epitopes for CTL (Table 6).

Table 6 CTL epitopes in HCV-core domain 1; the high score epitopes are displayed

Post-translational Modification

Prediction of serine, threonine, and tyrosine phosphorylation sites by DISPHOS showed one position (116) in domain1.

Using “NetPhos”, we found 10 phosphorylation sites (Fig. 2): 6 sites for serine (53, 56, 99, 103, 106, and 116), 3 sites for threonine (15, 49, and 52), and one site for tyrosine (86).

Fig. 2 Phosphorylation site prediction for domain 1 using “NetPhos” online software. Green lines indicate 6 sites for serine, blue lines show 3 sites for threonine, and one purple line shows tyrosine. All sites with scores above the threshold of 0.5 were considered phosphorylation sites

NetPhosK identified four phosphorylation sites: three threonine residues (3, 15, and 49) for protein kinase C and one serine (116) for protein kinase A. No glycosylation site was found by NetNGlyc or GlycoEP.

Secondary and Tertiary Structure Prediction

Secondary structure prediction for core and domain 1 using SOPMA is summarized in Table 7 and Fig. 3.

Table 7 Percentage of secondary structures in core and domain 1

Fig. 3 Secondary structure prediction using SOPMA. Red indicates extended strand, blue alpha helix, green beta turn, and purple random coil. The majority of the core structure is random coil

SOPMA showed no alpha helix structure in domain 1, the major part of which was random coil. However, the combination of (PS)2-v2 and PHYRE2 showed an alpha helix structure in the 8–15 region (Figs. 4, 5). All programs displayed an extended strand in the 29–36 region.

Fig. 4 Secondary structure prediction using PHYRE2. The result of this tool shows that the majority of the core structure (40%) is alpha helix, indicated by green helices; the confidence of the predicted structure for these regions is high

Fig. 5 Secondary structure prediction using (PS)2-v2. C coil, H helix, and E extended strand. The majority of the core structure is coil

3D structures were predicted with all three online programs, but only the structures predicted by I-TASSER were reliable. The final structures (Figs. 6, 7) were validated with QMEAN. The QMEAN score and Z-score calculated for core were 0.242 and −5.43; these scores were not satisfactory but at least provided an overview of the core protein structure. The QMEAN score and Z-score for domain 1 were 0.61 and −1.24, supporting the quality and reliability of the predicted structure.

Fig. 6 3D structure of the core protein predicted with the “I-TASSER” program. The selected model had the highest C-score and was validated with “Qmean”

Fig. 7 3D structure of domain 1 predicted with the I-TASSER program. The selected model had the highest C-score and was validated with “Qmean”

The Ramachandran plot was assessed with RAMPAGE; the percentages of residues in the favored, allowed, and outlier regions for core were 63.0%, 27.5%, and 9.5%, respectively (Fig. 8). RAMPAGE results for domain 1 showed 61.7% of residues in the favored region and 24.3% in the allowed region (Fig. 9). Figure 10 shows the T-cell and B-cell epitopes on the surface of the core protein.

Fig. 8 Ramachandran plot generated by RAMPAGE for the tertiary structure of the core protein (LCC model), visualizing the energetically allowed regions of the backbone dihedral angles ψ against ϕ of the amino acid residues. The majority of amino acid residues were in the favored region (119 amino acids) and the allowed region (52 amino acids)

Fig. 9 Ramachandran plot generated by RAMPAGE for the tertiary structure of domain 1 (LCC model), visualizing the energetically allowed regions of the backbone dihedral angles ψ against ϕ of the amino acid residues. The majority of amino acid residues were in the favored region (71 amino acids) and the allowed region (28 amino acids)

Fig. 10 A: the position of the B-cell epitopes (yellow region) on the core tertiary structure; B: the position of the T-cell epitopes (yellow region) on the core 3D structure
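The Ramachandran percentages can be checked against the residue counts given in the captions of Figs. 8 and 9. The sketch below assumes that the two terminal residues are excluded from the total (they lack a full ϕ/ψ pair), so the totals are taken as 191 − 2 for core and 117 − 2 for domain 1. That assumption is ours, not a statement from the source, although it reproduces the reported percentages.

```python
def region_percentages(favored, allowed, total):
    """Percent of residues in favored, allowed, and outlier regions (1 decimal)."""
    outlier = total - favored - allowed
    return tuple(round(100 * n / total, 1) for n in (favored, allowed, outlier))


# Assumption: terminal residues are not scored, so total = sequence length - 2.
core = region_percentages(favored=119, allowed=52, total=191 - 2)
domain1 = region_percentages(favored=71, allowed=28, total=117 - 2)

print("core:", core)
print("domain 1:", domain1)
```

Under this assumption the core values come out as 63.0%, 27.5%, and 9.5%, matching the text, and the domain 1 values as 61.7% and 24.3%. The remaining domain 1 residues would then be outliers (about 13.9%), a figure the source does not report.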

Core Signal Peptide

Neither “Signal-BLAST” nor “SignalP 4.1 Server” predicted any signal peptide for the core protein.

Cleavage Sites Prediction

The results of the “PeptideCutter” prediction are summarized in Table 8. The prediction was done for all the predicted epitopes. According to the results, the antigenic epitopes with fewer protease cleavage positions are the more promising candidate B-cell or T-cell epitopes.

Table 8 The results of predicted cleavage positions for 12 common proteases: B-cell, T-cell, and CTL predicted epitopes

Discussion

Although emerging bacterial and viral diseases have caused great catastrophes in human history, affecting anything from a small, localized group to millions of people across continents, several vaccines and therapies have been introduced to control them (Dehghani et al. 2013, 2014).

Fishman et al. (2009), using multivariable logistic regression models, found 10 HCV-core gene polymorphisms significantly associated with increased HCC risk (36G/C, 209A, 271C/U, 309A/C, 384U, 408U, 435A/C, 465U, 481A, 546A/C) and one significantly linked with decreased HCC risk (78U). These polymorphisms correspond to the following changes in the domain 1 amino acid sequence: N11S/T, K12Silent/N, A25V, G69S, Q70R, M91L, and L102P. All of these amino acid changes were associated with increased HCC risk except A25V (78U), which was associated with decreased risk (Fishman et al. 2009).

In the selected sequences, amino acid 11 was T in all sequences except one sequence from 2013 (P) and in 1a (N).

Position 12 (K) did not show any change; amino acid 25 was P and amino acid 69 was R in all sequences. Amino acid 70 was R in nearly all sequences, except in 2 sequences from 2006 (Q), 6 sequences from 2013 (3 Q, 2 H, and 1 P), 4 sequences from 2014 (Q), and 26 sequences from 2016 (25 Q and 1 H), as well as 2 reference sequences (Q).

Amino acid 91 was C in most sequences, except in 2 sequences from 2006 (M), 17 sequences from 2013 (13 M, 3 L, and 1 F), 50 sequences from 2016 (43 M and 7 L), and three reference sequences (1b, M; 2a and 5a, L). At amino acid 102 we did not find any change in the study sequences, but one of the reference sequences carried a change (5a, G to S).

Akuta et al. (2007) used PCR with mutation-specific primers to detect substitutions of aa 70 and aa 91 in the HCV-core gene of genotype 1b, which are important predictors of hepatocarcinogenesis. In wild-type samples, aa 70 was arginine (R) and aa 91 was leucine (L), whereas in mutants aa 70 was glutamine (Q)/histidine (H) and aa 91 was methionine (M) (Akuta et al. 2007).

Also, Furui et al. (2011) identified aa 70 and aa 91 substitutions among Japanese volunteer blood donors (Furui et al. 2011).

In terms of aa 70 substitutions, we identified 2 glutamine substitutions in the 2006 sequences; 3 glutamine, 2 histidine, and 1 proline in the 2013 sequences; 4 glutamine substitutions in the 2014 sequences; and 25 glutamine and 1 histidine in the 2016 sequences. We found several methionine residues at aa 91 in 2006, 2014, and 2016.
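The per-year counts for aa 70 can be totaled to check them against the summary figures used elsewhere in the Discussion. The sketch below uses only the counts for the study sequences reported above (reference sequences are excluded); the variable and function names are illustrative.

```python
from collections import Counter

# Substitutions at aa 70 in the study sequences, by year, as reported in the text.
AA70_BY_YEAR = {
    2006: {"Q": 2},
    2013: {"Q": 3, "H": 2, "P": 1},
    2014: {"Q": 4},
    2016: {"Q": 25, "H": 1},
}


def totals_by_residue(counts_by_year):
    """Sum the per-year substitution counts for each residue."""
    totals = Counter()
    for counts in counts_by_year.values():
        totals.update(counts)
    return totals


totals = totals_by_residue(AA70_BY_YEAR)
gln_or_his = totals["Q"] + totals["H"]
print(dict(totals), gln_or_his)
```

The totals are 34 glutamine, 3 histidine, and 1 proline, so 37 sequences carry Gln70 or His70, which agrees with the 34 and 37 quoted in the comparisons with Ogata et al. and Alestig et al.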

Ogata et al. (2002) compared core protein sequences of subtype 1b HCV strains obtained from patients with and without HCC and found the K23Q, Q70R, and T110M substitutions. In comparison, in our sequences aa 23 was K in all sequences, aa 70 was R in the majority of sequences and Q in 34 sequences, and we did not find any methionine at aa 110 (Ogata et al. 2002).

Akuta et al. (2010) confirmed the role of Gln70 (or His70) in the efficacy of triple therapy and in sustained virological response; patients with both genotype non-TT and Gln70 (His70) had the worst sustained virological response. Also, Akuta et al. (2011), by following up twenty-six patients, determined the role of the Gln70 (His70) substitution in the development of HCC. They suggested detecting aa substitutions in the core region before antiviral therapy (Akuta et al. 2010, 2011).

Alestig et al. (2011) showed that substitution at aa 70 of core was related to treatment response, but was less important than the IL28B polymorphism. In our study, 37 sequences had the Gln70 (or His70) substitution (Alestig et al. 2011).

Tokita et al. (2000) confirmed that the Thr49Pro substitution in the HCV-core region reduces the sensitivity of the fluorescence enzyme immunoassay (FEIA). We found 2 sequences in 2013 and one in 2016 with T to P substitutions (Tokita et al. 2000).

Findings of Horie et al. (1999) indicated that alteration from glycine to serine at core codon 45 was dominant in noncancerous liver portions rather than in cancerous liver portions and sera from HCC patients (Horie et al. 1999). Our results did not show any glycine to serine mutation in aa 45 and in all sequences, it was glycine.

Idrees and Ashfaq (2013), using molecular docking software, reported interactions between amino acid residues arginine 149, arginine 39, arginine 74, and arginine 78 in the HCV-core protein and Leu44, Ala71, Ser76, and Pro97 in CXCL6. This finding provides clues to understanding HCV pathogenesis: any change at these positions may relate to HCV infection and HCC. Our results showed no alteration at aa 39 and 74, and at aa 78 just one sequence showed a glutamine to arginine change (Idrees and Ashfaq 2013).

By combining the B-cell epitopes predicted by all methods on the immune epitope website with the BcePred and ABCpred predictions, we could define three major epitopes (4–20, 50–60, and 100–112) for domain 1.
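One simple way to combine several predictors is residue-level voting: each method votes for the residues it covers, and runs of residues with enough votes form consensus regions. The sketch below applies this to the ranges reported in the Results for the six immune epitope website methods and BcePred. ABCpred is left out because the text gives only one number per 16-mer. The voting rule and the threshold of 3 are our assumptions; the source does not describe how the authors combined the predictions.

```python
from collections import Counter

# Predicted regions (1-based, inclusive) as reported in the Results.
PREDICTIONS = {
    "Chou-Fasman": [(99, 112)],
    "Emini": [(4, 20), (49, 61)],
    "Karplus-Schulz": [(5, 12), (50, 60)],
    "Kolaskar-Tongaonkar": [(20, 39), (43, 49), (63, 69), (78, 85), (93, 101)],
    "Parker": [(9, 16), (51, 57)],
    "BepiPred": [(1, 25), (51, 84), (102, 116)],
    "BcePred": [(4, 22), (47, 59), (109, 115)],
}


def consensus_regions(predictions, min_votes):
    """Merge residues supported by at least min_votes methods into regions."""
    votes = Counter()
    for ranges in predictions.values():
        covered = set()
        for start, end in ranges:
            covered.update(range(start, end + 1))
        votes.update(covered)  # each method votes at most once per residue
    selected = sorted(pos for pos, n in votes.items() if n >= min_votes)
    regions = []
    for pos in selected:
        if regions and pos == regions[-1][1] + 1:
            regions[-1] = (regions[-1][0], pos)  # extend the current run
        else:
            regions.append((pos, pos))  # start a new run
    return regions


print(consensus_regions(PREDICTIONS, min_votes=3))
```

With a threshold of three votes this sketch returns 4–22, 49–60, and 109–112, which roughly tracks the three major epitopes named above, although the boundaries differ and the authors' actual combination rule may not have been a vote count.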

Ferroni et al. (1993) by using the algorithm of Jameson and Wolf identified four epitopes in HCV-core protein (7–21, 31–45, 49–63, and 99–113).

Harase et al. (1995) analyzed the response to HCV-core protein in mice and found a major B cell epitope (21–40).

Pirisi et al. (1995) analyzed sera from 97 HCV-infected patients, using three 15-mer peptides as antigens in an enzyme immunoassay. They concluded that R15P (50–64, RKTSERSQPRGRRQP) is a potent antigen and that anti-R15P might help to identify a subgroup at higher risk of developing HCC (Pirisi et al. 1995).

Also, a study of HCV-positive blood donors by Lechmann et al. (1996) identified a region (aa 1–24) of domain 1 that bound antibodies from the sera of all patients and showed great potential for detecting HCV infection with serological tests of B-cell responses.

Comparison of our results with previous studies revealed that all predicted epitopes in our research have a good potential for future studies of the immune response against HCV infection, and are useful for recognition of all kinds of HCV infections.

Gededzha et al. (2014) used several bioinformatics tools to predict HLA class I and HLA class II epitopes in HCV genotype 5a. They found three T-cell epitopes of NS3, NS4B, and NS5B.

Some HLA class II alleles were found in Iranian patients by Samimi-Rad et al. (2015); DRB1*0301, DQA1*0501, DQB1*0201, DRB1*1101, and DQB1*0301 were demonstrated in patients with HCV clearance, and DRB1*0701, DQA1*0201, DQB1*0602, DRB1*0301, DRB1*11, and DQB1*0201 occurred more frequently in chronic patients (Samimi-Rad et al. 2015).

Khorrami et al. (2015) found a relationship between HLA-G, IL-10, and response to combined therapy in HCV-positive patients. They concluded that HLA-G and IL-10 have a significant role in the response to therapy with IFN-α2a and ribavirin.

Also, HLA-A01 and HLA-B38 were determined as important alleles associated with Peg-IFN plus ribavirin therapy in Egyptian patients by Farag et al. (2013).

Pourhassan et al. (2014) found several HLA alleles associated with HCV in Iranian patients (2011–2013). A2, A3, B35, B38, BW4, CW4, and CW7 were the most frequent alleles found by this group.

In accordance with the above-mentioned studies, we collected HLA alleles associated with HCV infection and, by in-silico analysis, identified numerous T-cell epitopes for domain 1 that can be helpful for future studies aiming to design an effective vaccine against HCV genotype 1a and can provide useful data for a better understanding of the role of domain 1 in the immune response.

Many researchers have demonstrated broader CTL responses to HCV infection and the usefulness of CTL epitope mapping for developing therapeutic interventions or vaccines (Sabet et al. 2014; Saeedi et al. 2014; Jazayeri and Carman 2005; Arashkia et al. 2011). In our research, we used reliable software to predict CTL epitopes and extracted data that can be useful for vaccine development studies.

Considering the phosphorylation site predictions of the three programs together, we concluded that three sites (15, 49, and 116) were the main phosphorylation sites in domain 1. Amino acid 116 is a serine located in an Arg-Arg-Arg-Ser-Arg region, which resembles the usual target sequence for protein kinase A [Arg-Arg-X-Ser/Thr-X]. Two threonine residues (15 and 49) were predicted as protein kinase C target sites; this kinase acts by phosphorylating the hydroxyl groups of amino acid residues. Previous studies indicated that core protein is phosphorylated by PKA and PKC.
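The protein kinase A consensus quoted above, Arg-Arg-X-Ser/Thr-X, can be written as the pattern `RR.[ST].` and scanned for directly. The sketch below is a plain motif search, not the NetPhosK algorithm. It uses only the Arg-Arg-Arg-Ser-Arg pentapeptide named in the text and assumes, for illustration, that this fragment starts at residue 113 so that its serine is residue 116.

```python
import re

# Arg-Arg-X-Ser/Thr-X; the lookahead lets overlapping matches be reported.
PKA_MOTIF = re.compile(r"(?=(RR.[ST].))")


def pka_sites(seq, first_residue=1):
    """Return the residue numbers of the Ser/Thr in each motif match.

    first_residue is the residue number of seq[0] in the full protein.
    """
    return [first_residue + m.start() + 3 for m in PKA_MOTIF.finditer(seq)]


# The pentapeptide from the text, assumed here to span residues 113-117.
print(pka_sites("RRRSR", first_residue=113))
```

A match only shows that the local sequence fits the consensus; whether the site is actually phosphorylated still needs experimental support.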

Core phosphorylation regulates the suppressive activity of HCV-core protein on HBV gene replication and expression. It has also been shown that phosphorylation of core relates to the nuclear localization of the core protein. Previous studies demonstrated three serine residues (Ser-53, Ser-116, and Ser-99) as potential phosphorylation sites in core protein, consistent with the NetPhos results in our research (Yassin 2001; Shih et al. 1995; Lu and Ou 2002).

Secondary structure prediction indicated that the majority of domain 1 was random coil, and that all B-cell epitopes and important mutations were located in random coil structure.

Tertiary structures were predicted by three widely used online programs, but just one of them provided a reliable and high-quality protein structure model for domain 1. The quality and reliability of the models were assessed with QMEAN and RAMPAGE.

Comparing the epitope predictions for core with and without the signal peptide, we found no difference between the two analyses (Pene et al. 2009; Targett-Adams et al. 2008; Ma et al. 2007; Okamoto et al. 2008; Oehler et al. 2012).

Based on previous studies, core protein has a C-terminal signal peptide (170–191); because the focus of our study was on domain 1, deletion of this region was not expected to affect the epitope prediction results.

Digestion analysis showed that each epitope can be digested by at least 5 of the selected proteases, which can substantially reduce the half-life of the epitopes.
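To make the idea of a cleavage-site count concrete, the sketch below applies the textbook simplification of trypsin specificity (cleavage on the C-terminal side of Lys or Arg, unless the next residue is Pro) to the R15P peptide quoted earlier in the Discussion. This is only one protease and a simplified rule; PeptideCutter uses more detailed rules, and the function name is illustrative.

```python
def trypsin_sites(peptide):
    """Positions (1-based) after which simplified trypsin rules predict cleavage.

    Rule used here: cleave after K or R unless the following residue is P.
    The last residue is never a cleavage position.
    """
    sites = []
    for i in range(len(peptide) - 1):
        if peptide[i] in "KR" and peptide[i + 1] != "P":
            sites.append(i + 1)
    return sites


# R15P peptide (core aa 50-64) as quoted from Pirisi et al. in the text.
r15p = "RKTSERSQPRGRRQP"
sites = trypsin_sites(r15p)
print(len(sites), sites)
```

Positions are numbered within the peptide; adding 49 converts them to core numbering for this epitope. Arginine- and lysine-rich epitopes such as this one naturally accumulate many trypsin sites under the simplified rule.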

Conclusion

In conclusion, this study provides comprehensive data on frequent mutations in domain 1 and, as a first report, can be useful for future studies of significant mutations and their role in therapeutic pathways and response to antiviral therapy in Iranian patients. In addition, the identification of domain 1 properties provides practical information for domain 1 cloning and further research.