Abstract
Major histocompatibility complex (MHC)-binding peptides are essential for antigen recognition by T-cell receptors and are being explored for vaccine design. Computational methods have been developed for predicting MHC-binding peptides of fixed lengths, trained on relatively few non-binders. Methods that apply to peptides of flexible lengths and are trained on more diverse sets of non-binders are therefore desirable. MHC-BPS is a web-based MHC-binder prediction server that uses support vector machines (SVMs) to predict peptide binders of flexible lengths for 18 MHC class I and 12 class II alleles from sequence-derived physicochemical properties. The SVMs were trained on 4,208∼3,252 binders and 234,333∼168,793 non-binders and evaluated on an independent set of 545∼476 binders and 110,564∼84,430 non-binders. The binder prediction accuracies are 86∼99% for 25 alleles and 70∼80% for five alleles, and the non-binder accuracies are 96∼99% for all 30 alleles. A screening of the HIV-1 genome identifies 0.01∼5% and 5∼8% of the constituent peptides as binders for 24 and 6 alleles, respectively, including 75∼100% of the known epitopes. The method correctly predicts 73.3% of the 15 epitopes newly published in the last 4 months of 2005. MHC-BPS is available at http://bidd.cz3.nus.edu.sg/mhc/.
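To illustrate how peptides of flexible lengths can be presented to a classifier that expects fixed-length input, the following is a minimal sketch, not the descriptor set used by MHC-BPS. It assumes a simplified encoding (amino acid composition plus mean Kyte-Doolittle hydropathy) as a stand-in for the sequence-derived physicochemical properties, and the function names `encode_peptide` and `candidate_peptides`, as well as the default length range, are hypothetical choices made for this example.

```python
# Illustrative sketch only: a simplified, length-independent peptide encoding.
# The actual MHC-BPS descriptors are not specified here; this is an assumption.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_peptide(peptide):
    """Map a peptide of any length to a 21-dimensional feature vector.

    The first 20 entries are amino acid fractions; the last entry is the
    mean hydropathy. The vector length does not depend on peptide length.
    """
    peptide = peptide.upper()
    if not peptide or any(aa not in KYTE_DOOLITTLE for aa in peptide):
        raise ValueError("peptide must be a non-empty string of standard amino acids")
    n = len(peptide)
    composition = [peptide.count(aa) / n for aa in AMINO_ACIDS]
    mean_hydropathy = sum(KYTE_DOOLITTLE[aa] for aa in peptide) / n
    return composition + [mean_hydropathy]


def candidate_peptides(protein, min_len=8, max_len=11):
    """Yield every contiguous sub-sequence with a length in [min_len, max_len]."""
    for length in range(min_len, max_len + 1):
        for start in range(len(protein) - length + 1):
            yield protein[start:start + length]


if __name__ == "__main__":
    # Two peptides of different lengths yield vectors of the same size.
    for pep in ("SLYNTVATL", "PKYVKQNTLKLATGG"):
        print(pep, len(pep), len(encode_peptide(pep)))
```

Vectors of this kind could be passed to any SVM implementation. A genome-wide screening such as the HIV-1 example would enumerate candidate peptides with a sliding window like `candidate_peptides` and classify each one.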
Acknowledgement
This work was supported in part by grants from Singapore ARF R-151-000-031-112, Shanghai Commission for Science and Technology (04QMX1450) and the 973 National Key Basic Research Program of China (2004CB720103).
Electronic supplementary material
Below is the link to the supplementary material.
Table 1
Distribution of the binding peptides of different HLA alleles with respect to peptide length in units of the number of amino acids (34,692 kb)
Table 2
Data sets and the computed binder and non-binder prediction accuracies of the SVM prediction systems for different HLA alleles developed in this work. A total of 18 MHC class I and 12 MHC class II alleles are covered. TP, TN, FP, and FN are the numbers of true positives (true binders), true negatives (true non-binders), false positives (false binders), and false negatives (false non-binders), respectively. The total numbers of binders and non-binders in a data set are TP + FN and TN + FP, respectively (42,217 kb)
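As a small worked example of the counts defined in the Table 2 caption, the sketch below computes per-class accuracies from TP, TN, FP, and FN. It assumes that binder accuracy is TP / (TP + FN) and non-binder accuracy is TN / (TN + FP), the usual sensitivity and specificity definitions; the caption only gives the denominators, so the formulas are an assumption. The function name and the counts in the usage lines are hypothetical, not values from the table.

```python
# Sketch under stated assumptions: accuracy per class is the fraction of that
# class predicted correctly (sensitivity for binders, specificity for non-binders).

def prediction_accuracies(tp, tn, fp, fn):
    """Return binder and non-binder prediction accuracies as fractions."""
    binders = tp + fn        # total binders in the data set
    non_binders = tn + fp    # total non-binders in the data set
    if binders == 0 or non_binders == 0:
        raise ValueError("data set must contain both binders and non-binders")
    return {
        "binder_accuracy": tp / binders,
        "non_binder_accuracy": tn / non_binders,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    result = prediction_accuracies(tp=90, tn=990, fp=10, fn=10)
    print(result)
```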
Table 3
List of newly reported epitopes in the last 4 months of 2005 and SVM prediction results (5,351 kb)
Table 4
Statistics of the predicted peptide binders from the HIV-1 genome (NCBI entry NC_001802) by using our method and several other web-based prediction servers (130,095 kb)
Additional data sets for evaluation
Binders and non-binders are available in the MHCBN database (http://bioinformatics.uams.edu/mirror/mhcbn/) and the SYFPEITHI database (http://www.syfpeithi.de/)
Newly reported epitopes in the last 4 months of 2005 and SVM prediction results
NC_001802, Human immunodeficiency virus 1, complete genome ssRNA; linear; length, 9,181 nt
Cite this article
Cui, J., Han, L.Y., Lin, H.H. et al. MHC-BPS: MHC-binder prediction server for identifying peptides of flexible lengths from sequence-derived physicochemical properties. Immunogenetics 58, 607–613 (2006). https://doi.org/10.1007/s00251-006-0117-2
DOI: https://doi.org/10.1007/s00251-006-0117-2