Abstract
High-performance computing has made it possible to use bioinformatics and systems biology to explore complex relationships among data, and to tackle very large and detailed simulations of biological systems. Many supercomputing centers have entered this field because of its potential for significant impact. The development of new algorithms, especially parallel algorithms and software that mine new biological information and assess relationships among the members of a large biological data set, is becoming increasingly important. This article presents our work on the design and development of parallel algorithms and software for several important open problems in bioinformatics, such as structure alignment of RNA sequences, finding new genes, alternative splicing, and gene expression clustering. To make this parallel software available to a wide audience, grid computing service interfaces to it have been deployed on the China National Grid (CNGrid). Finally, conclusions and some future research directions are presented.
Cite this article
Niu, BF., Lang, XY., Lu, ZH. et al. Parallel algorithm research on several important open problems in bioinformatics. Interdiscip Sci Comput Life Sci 1, 187–195 (2009). https://doi.org/10.1007/s12539-009-0004-7
DOI: https://doi.org/10.1007/s12539-009-0004-7