The Proteome Folding Project: Proteome-scale prediction of structure and function
- Kevin Drew1,
- Patrick Winters1,
- Glenn L. Butterfoss1,
- Viktors Berstis2,
- Keith Uplinger2,
- Jonathan Armstrong2,
- Michael Riffle3,
- Erik Schweighofer4,
- Bill Bovermann2,
- David R. Goodlett5,
- Trisha N. Davis3,
- Dennis Shasha6,
- Lars Malmström7 and
- Richard Bonneau1,4,6,8
- 1Center for Genomics and Systems Biology, Department of Biology, New York University, New York, New York 10003, USA;
- 2IBM, Austin, Texas 78758, USA;
- 3Department of Biochemistry, Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA;
- 4Institute for Systems Biology, Seattle, Washington 98103, USA;
- 5Medicinal Chemistry Department, University of Washington, Seattle, Washington 98195, USA;
- 6Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, New York, New York 10003, USA;
- 7Institute of Molecular Systems Biology, ETH Zurich, Zurich CH 8093, Switzerland
Abstract
The incompleteness of proteome structure and function annotation is a critical problem for biologists and, in particular, severely limits interpretation of high-throughput and next-generation experiments. We have developed a proteome annotation pipeline based on structure prediction, where function and structure annotations are generated using an integration of sequence comparison, fold recognition, and grid-computing-enabled de novo structure prediction. We predict protein domain boundaries and three-dimensional (3D) structures for protein domains from 94 genomes (including human, Arabidopsis, rice, mouse, fly, yeast, Escherichia coli, and worm). De novo structure predictions were distributed on a grid of more than 1.5 million CPUs worldwide (World Community Grid). We generated significant numbers of new confident fold annotations (9% of domains that are otherwise unannotated in these genomes). We demonstrate that predicted structures can be combined with annotations from the Gene Ontology database to predict new and more specific molecular functions.
Footnotes
- 8Corresponding author. E-mail bonneau{at}nyu.edu.
- [Supplemental material is available for this article.]
- Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.121475.111. Freely available online through the Genome Research Open Access option.
- Received January 26, 2011.
- Accepted July 28, 2011.
- Copyright © 2011 by Cold Spring Harbor Laboratory Press