Abstract
Current T cell epitope prediction tools are a valuable resource in designing targeted immunogenicity experiments. They typically focus on, and can accurately predict, peptide binding and presentation by major histocompatibility complex (MHC) molecules on the surface of antigen-presenting cells. However, recognition of the peptide-MHC complex by a T cell receptor (TCR) is often not included in these tools. We developed a classification approach based on random forest classifiers to predict recognition of a peptide by a T cell receptor and to discover patterns that contribute to recognition. We considered two formulations of this problem: (1) distinguishing between two sets of TCRs that each bind to a known peptide and (2) retrieving TCRs that bind to a given peptide from a large pool of TCRs. Evaluation of the models on two HIV-1, B*08-restricted epitopes reveals good performance and hints at structural CDR3 features that can determine peptide immunogenicity. These results are of particular importance because they show that predicting T cell epitopes and T cell epitope recognition from sequence data is a feasible approach. In addition, the validity of our models not only serves as a proof of concept for the prediction of immunogenic T cell epitopes but also paves the way for more general and high-performing models.
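As an illustration of formulation (1), the sketch below shows one way TCR sequences could be turned into labeled numeric vectors before being handed to a classifier. It is a minimal sketch under stated assumptions, not the authors' pipeline: the abstract does not say which features were used, so the encoding (CDR3 length plus relative amino acid frequencies), the function names `encode_cdr3` and `build_dataset`, and the two CDR3 sequences are all hypothetical. Only the two epitope sequences come from the article's supplementary material.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every vector has the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def encode_cdr3(cdr3):
    """Encode a CDR3 amino acid sequence as a fixed-length numeric vector.

    Assumed feature set (not taken from the article): sequence length,
    followed by the relative frequency of each standard amino acid.
    """
    if not cdr3:
        raise ValueError("CDR3 sequence must not be empty")
    counts = Counter(cdr3)
    length = len(cdr3)
    return [length] + [counts[aa] / length for aa in AMINO_ACIDS]


def build_dataset(records, positive_epitope):
    """Build feature rows X and binary labels y from (cdr3, epitope) pairs.

    A TCR is labeled 1 if it is associated with `positive_epitope`, else 0,
    which corresponds to telling two epitope-specific TCR sets apart.
    """
    X, y = [], []
    for cdr3, epitope in records:
        X.append(encode_cdr3(cdr3))
        y.append(1 if epitope == positive_epitope else 0)
    return X, y


# Hypothetical CDR3 sequences, invented for illustration only.
records = [
    ("CASSLGQAYEQYF", "EIYKRWII"),
    ("CASSPGTGGYEQYF", "FLKEKGGL"),
]
X, y = build_dataset(records, "EIYKRWII")
```

Vectors of this kind could then be passed to any random forest implementation, for example the `RandomForestClassifier` in scikit-learn, whose feature importances offer one route to inspecting which sequence properties drive a prediction.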
Acknowledgments
This research was funded by the University of Antwerp [BOF Concerted Research Action] and the Research Foundation Flanders (FWO) [Personal PhD grants to NDN (1S29816N), PMo (1141217N) and BC (11O1614N)].
Electronic supplementary material
Online resource 1
List of TCRβs with MHC-epitope association used during training. The TCRβs listed are either HLA-B*08-EIYKRWII or HLA-B*08-FLKEKGGL restricted. The list contains four columns describing, respectively, the V-family/gene region number, the CDR3 amino acid sequence, the J-family/gene region number and the target epitope. (CSV 10 kb)
Cite this article
De Neuter, N., Bittremieux, W., Beirnaert, C. et al. On the feasibility of mining CD8+ T cell receptor patterns underlying immunogenic peptide recognition. Immunogenetics 70, 159–168 (2018). https://doi.org/10.1007/s00251-017-1023-5
DOI: https://doi.org/10.1007/s00251-017-1023-5