Storlet Engine for Executing Biomedical Processes Within the Storage System

  • Conference paper
Business Process Management Workshops (BPM 2014)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 202)

Abstract

The growing number of large biomedical data objects that are stored in long-term archives and must be continuously processed and analyzed requires new storage paradigms. We propose expanding the storage system from only storing biomedical data to directly producing value from the data by executing computational modules, called storlets, close to where the data is stored. This paper describes the Storlet Engine, an engine that supports computations in secure sandboxes within the storage system. We describe its architecture and security model as well as the programming model for storlets. We experimented with several data sets and storlets, including a de-identification storlet that de-identifies sensitive medical records, an image transformation storlet that converts images to sustainable formats, and various medical imaging analytics storlets for studying pathology images. We also provide a performance study of the Storlet Engine prototype for OpenStack Swift object storage.
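To make the idea of a storlet more concrete, the following is a minimal sketch of what a de-identification computation running next to the stored data might look like. It is not the Storlet Engine's programming model, which this page does not reproduce; the function name `deidentify_storlet`, its stream-in/stream-out signature, the JSON-lines record format, and the field names are all assumptions chosen for illustration.

```python
import io
import json
from typing import BinaryIO, Dict

# Hypothetical identifying fields; these names are not taken from the paper.
SENSITIVE_FIELDS = {"patient_name", "patient_id", "birth_date"}


def deidentify_storlet(in_stream: BinaryIO, out_stream: BinaryIO,
                       params: Dict[str, str]) -> int:
    """Illustrative storlet-style function (assumed interface).

    Reads JSON-lines records from the stored object, masks sensitive
    fields, and writes the masked records to the output stream.
    Returns the number of records processed.
    """
    # An optional comma-separated "fields" parameter overrides the defaults.
    requested = set(params.get("fields", "").split(",")) - {""}
    fields = requested or SENSITIVE_FIELDS

    count = 0
    for line in in_stream:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        for name in fields:
            if name in record:
                record[name] = "REDACTED"
        out_stream.write((json.dumps(record, sort_keys=True) + "\n").encode("utf-8"))
        count += 1
    return count


# Usage: in-memory streams stand in for the object store's data path.
stored_object = io.BytesIO(
    b'{"patient_name": "Jane Doe", "diagnosis": "benign"}\n'
)
result = io.BytesIO()
processed = deidentify_storlet(stored_object, result, {})
```

Because the function only consumes an input stream and produces an output stream, the same logic could in principle run inside a sandbox on the storage node, so that only the masked records leave the storage system.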

Notes

  1. http://www.visioncloud.eu

  2. https://wiki.openstack.org/wiki/Swift

  3. http://ensure-fp7.eu

  4. http://www.forgetit-project.eu

  5. https://www.docker.io

Acknowledgments

The research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/2007–2013) under grant agreement 270000 and under grant agreement 600826.

Author information

Correspondence to Simona Rabinovici-Cohen.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Rabinovici-Cohen, S., Henis, E., Marberg, J., Nagin, K. (2015). Storlet Engine for Executing Biomedical Processes Within the Storage System. In: Fournier, F., Mendling, J. (eds) Business Process Management Workshops. BPM 2014. Lecture Notes in Business Information Processing, vol 202. Springer, Cham. https://doi.org/10.1007/978-3-319-15895-2_6

  • DOI: https://doi.org/10.1007/978-3-319-15895-2_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-15894-5

  • Online ISBN: 978-3-319-15895-2

  • eBook Packages: Computer Science, Computer Science (R0)
