Abstract
The Proteomics Identifications Database (PRIDE, http://www.ebi.ac.uk/pride) lets users explore and compare mass spectrometry-based proteomics experiments that reveal details of protein expression across a broad range of taxonomic groups, tissues, and disease states. A PRIDE experiment typically includes identifications of proteins, peptides, and protein modifications, and many submitted experiments also include the mass spectra that provide the evidence for these identifications. One of the strongest advantages of PRIDE over other proteomics repositories is the amount of metadata it contains, which is key to placing these data in their biological and/or technical context. Several informatics tools have been developed in support of the PRIDE database. The most recent, “Database on Demand” (DoD), allows custom sequence databases to be built in order to optimize search engine results. We describe the use of DoD in this chapter. In addition, to show the potential of PRIDE as a source for data mining, we explore complex federated BioMart queries that integrate PRIDE data with other resources, such as Ensembl, Reactome, or UniProt.
© 2011 Springer Science+Business Media, LLC
Cite this protocol
Vizcaíno, J.A., Reisinger, F., Côté, R., Martens, L. (2011). PRIDE and “Database on Demand” as Valuable Tools for Computational Proteomics. In: Hamacher, M., Eisenacher, M., Stephan, C. (eds) Data Mining in Proteomics. Methods in Molecular Biology, vol 696. Humana Press. https://doi.org/10.1007/978-1-60761-987-1_6
DOI: https://doi.org/10.1007/978-1-60761-987-1_6
Publisher Name: Humana Press
Print ISBN: 978-1-60761-986-4
Online ISBN: 978-1-60761-987-1
eBook Packages: Springer Protocols