Bio-Cirrus: A Framework for Running Legacy Bioinformatics Applications with Cloud Computing Resources

  • Conference paper
Advances in Computational Intelligence (IWANN 2013)

Abstract

Technological advances in biological and biomedical data acquisition are creating mountains of data. Existing legacy applications are unable to process this data without using new strategies. However, some workloads in bioinformatics are easily parallelized by splitting the data, running legacy applications in parallel, and then joining the partial results into one final result. In this paper, we present Bio-Cirrus, a software package that facilitates this process. Our software consists of a user-friendly client (jORCA) for accessing Web Services and enacting workflows, and a module (Mr. Cirrus) for processing the data with a map/reduce-style approach. Bio-Cirrus binaries and documentation are freely available at http://www.bitlab-es.com/cloud under the Creative Commons Attribution-No Derivative Works 2.5 Spain License, and its source code is available on request (GPL v3 license).
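
The split, run, and join pattern described in the abstract can be illustrated with a minimal Python sketch. This is not the Mr. Cirrus implementation; it only assumes a legacy program that reads FASTA-style records on standard input and writes its result to standard output. The function names (`split_records`, `run_legacy`, `map_reduce`) and the stand-in command in the demo are hypothetical and chosen for illustration.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def split_records(text, n_chunks):
    """Split FASTA-style text into at most n_chunks pieces without cutting a record."""
    records = [">" + r for r in text.split(">") if r.strip()]
    if not records:
        return []
    n_chunks = max(1, min(n_chunks, len(records)))
    size = -(-len(records) // n_chunks)  # ceiling division
    return ["".join(records[i:i + size]) for i in range(0, len(records), size)]


def run_legacy(command, chunk):
    """Map step: feed one chunk to the legacy program on stdin and return its stdout."""
    done = subprocess.run(command, input=chunk, capture_output=True,
                          text=True, check=True)
    return done.stdout


def map_reduce(command, text, n_chunks=4):
    """Run the legacy program on every chunk in parallel, then join the partial results."""
    chunks = split_records(text, n_chunks)
    if not chunks:
        return ""
    # Threads are enough here: each worker only waits on an external process.
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        partials = list(pool.map(lambda c: run_legacy(command, c), chunks))
    # Reduce step (assumption): partial outputs can simply be concatenated in order.
    return "".join(partials)


if __name__ == "__main__":
    # Stand-in for a legacy tool (hypothetical): upper-cases whatever it reads.
    demo_command = [sys.executable, "-c",
                    "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    print(map_reduce(demo_command, ">a\nacgt\n>b\nttga\n>c\nggcc\n", n_chunks=2))
```

Concatenating the partial outputs is the simplest possible join and only works when records are processed independently. A real legacy tool may need a tool-specific merge step, for example to combine headers or re-sort hits.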




© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Karlsson, T.J.M. et al. (2013). Bio-Cirrus: A Framework for Running Legacy Bioinformatics Applications with Cloud Computing Resources. In: Rojas, I., Joya, G., Cabestany, J. (eds) Advances in Computational Intelligence. IWANN 2013. Lecture Notes in Computer Science, vol 7903. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38682-4_23

  • DOI: https://doi.org/10.1007/978-3-642-38682-4_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38681-7

  • Online ISBN: 978-3-642-38682-4

  • eBook Packages: Computer Science, Computer Science (R0)
