Target-Decoy Search Strategy for Mass Spectrometry-Based Proteomics

Elias, Joshua E.; Gygi, Steven P.

doi:10.1007/978-1-60761-444-9_5

Part of the book series: Methods in Molecular Biology™ (MIMB, volume 604)

Abstract

Accurate and precise methods for estimating incorrect peptide and protein identifications are crucial for effective large-scale proteome analyses by tandem mass spectrometry. The target-decoy search strategy has emerged as a simple, effective tool for generating such estimations. This strategy is based on the premise that obvious, necessarily incorrect “decoy” sequences added to the search space will correspond with incorrect search results that might otherwise be deemed to be correct. With this knowledge, it is possible not only to estimate how many incorrect results are in a final data set but also to use decoy hits to guide the design of filtering criteria that sensitively partition a data set into correct and incorrect identifications.
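The strategy described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes decoys are made by reversing each target sequence, that targets and decoys are searched together in one concatenated database, and that the false discovery rate is estimated as twice the number of decoy hits divided by the total number of accepted hits, an estimator commonly used for concatenated searches. The function names, the `DECOY_` prefix, and the toy scores are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# A peptide-spectrum match, reduced to (protein identifier, search score).
PSM = Tuple[str, float]


def reverse_decoys(targets: Dict[str, str], prefix: str = "DECOY_") -> Dict[str, str]:
    """Return the target entries plus one reversed-sequence decoy per target."""
    combined = dict(targets)
    for name, seq in targets.items():
        combined[prefix + name] = seq[::-1]
    return combined


def estimate_fdr(psms: List[PSM], threshold: float, prefix: str = "DECOY_") -> float:
    """Estimate the FDR among hits scoring at or above `threshold`.

    Assumption: incorrect matches land on target and decoy sequences equally
    often, so the number of incorrect hits is about twice the decoy hits.
    """
    accepted = [pid for pid, score in psms if score >= threshold]
    if not accepted:
        return 0.0
    decoy_hits = sum(pid.startswith(prefix) for pid in accepted)
    return min(1.0, 2 * decoy_hits / len(accepted))


def lowest_threshold_for_fdr(
    psms: List[PSM], max_fdr: float, prefix: str = "DECOY_"
) -> Optional[float]:
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr."""
    for cutoff in sorted({score for _, score in psms}):
        if estimate_fdr(psms, cutoff, prefix) <= max_fdr:
            return cutoff
    return None


# Toy search results (hypothetical scores, higher is better).
psms = [
    ("P1", 9.0), ("P2", 8.0), ("DECOY_P3", 7.0),
    ("P4", 6.0), ("DECOY_P5", 5.0), ("P6", 4.0),
]
database = reverse_decoys({"P1": "PEPTIDE"})
cutoff = lowest_threshold_for_fdr(psms, max_fdr=0.5)
```

Using decoy hits to choose the cutoff, as in `lowest_threshold_for_fdr`, is one simple way to let decoys guide filtering criteria. Real workflows typically filter on several score dimensions at once.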

Acknowledgments

This work was supported in part by National Institutes of Health (NIH) GM67945 and HG00041 (S.P.G.).

Author information

Authors and Affiliations

Department of Cell Biology, Harvard Medical School, Boston, MA, USA
Joshua E. Elias
Taplin Biological Mass Spectrometry Facility, Harvard Medical School, Boston, MA, USA
Steven P. Gygi

Corresponding author

Correspondence to Steven P. Gygi.

Editor information

Editors and Affiliations

Faculty of Life Sciences, University of Manchester, Oxford Rd., Manchester, M13 9PT, United Kingdom
Simon J. Hubbard
Faculty of Veterinary Science, University of Liverpool, Liverpool, L69 7ZJ, United Kingdom
Andrew R. Jones

Cite this protocol

Elias, J.E., Gygi, S.P. (2010). Target-Decoy Search Strategy for Mass Spectrometry-Based Proteomics. In: Hubbard, S., Jones, A. (eds) Proteome Bioinformatics. Methods in Molecular Biology™, vol 604. Humana Press. https://doi.org/10.1007/978-1-60761-444-9_5

DOI: https://doi.org/10.1007/978-1-60761-444-9_5
Published: 05 December 2009
Publisher Name: Humana Press
Print ISBN: 978-1-60761-443-2
Online ISBN: 978-1-60761-444-9
eBook Packages: Springer Protocols
