The RePhrase Extended Pattern Set for Data Intensive Parallel Computing

International Journal of Parallel Programming

Abstract

We discuss the extended parallel pattern set identified within the EU-funded project RePhrase as a candidate pattern set to support data intensive applications targeting heterogeneous architectures. The set has been designed to include three classes of pattern: (1) core patterns, modelling common, not necessarily data intensive, parallelism exploitation patterns, usually to be used in composition; (2) high level patterns, modelling common, complex and complete parallelism exploitation patterns; and (3) building block patterns, modelling the single components of data intensive applications, suitable for use, in composition, to implement patterns not covered by the core and high level patterns. We discuss the expressive power of the RePhrase extended pattern set and present results illustrating the performance that may be achieved with the FastFlow implementation of the high level patterns.
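To make the idea of patterns "used in composition" concrete, the following is a minimal sketch in Python. It is not the RePhrase pattern set and it does not use the FastFlow API. The names `seq`, `farm` and `pipe` are hypothetical helpers chosen for illustration, and modelling a pattern as a function from an input stream to an output stream is an assumption of this sketch.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, TypeVar

A = TypeVar("A")
B = TypeVar("B")

# Assumption of this sketch: a pattern is a function from a stream of
# inputs to a stream of outputs, so patterns compose like functions.


def seq(f: Callable[[A], B]) -> Callable[[Iterable[A]], Iterator[B]]:
    """Hypothetical helper: wrap f as a sequential stream stage."""
    def stage(stream: Iterable[A]) -> Iterator[B]:
        for item in stream:
            yield f(item)
    return stage


def farm(f: Callable[[A], B], workers: int = 4) -> Callable[[Iterable[A]], Iterator[B]]:
    """Hypothetical helper: apply f to independent items in parallel.

    Results are yielded in input order, because Executor.map preserves it.
    """
    def stage(stream: Iterable[A]) -> Iterator[B]:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            yield from pool.map(f, stream)
    return stage


def pipe(*stages):
    """Hypothetical helper: chain stages so each consumes the previous output."""
    def stage(stream):
        for s in stages:
            stream = s(stream)
        return stream
    return stage


# Composition: a sequential stage followed by a parallel one.
app = pipe(seq(lambda x: x + 1), farm(lambda x: x * x, workers=2))
result = list(app(range(5)))  # [1, 4, 9, 16, 25]
```

Because every helper returns a stage of the same shape, a composed pattern such as `app` can itself be passed to `pipe` again. This nesting is the sense of "composition" that the sketch is meant to illustrate.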

Notes

  1. http://hadoop.apache.org/.

  2. http://spark.apache.org/.

  3. http://storm.apache.org/.

  4. http://flink.apache.org/.

  5. http://www-03.ibm.com/software/products/en/ibm-streams.

  6. In the following we use Greek letters to denote data types. The expression \(x : \alpha\) denotes an object x whose type is \(\alpha\), while the expression \(f : \alpha \rightarrow \beta\) denotes a function f computing a result of type \(\beta\) from an input of type \(\alpha\).

  7. Where () is the “no parameter” (void) type.

  8. http://parsec.cs.princeton.edu/.

  9. http://calvados.di.unipi.it/fastflow.


Acknowledgements

This work has been partially funded by the EU H2020-ICT-2014-1 Project No. 644235 RePhrase “Refactoring Parallel Heterogeneous Resource-Aware Applications” (http://www.reprhase-ict.eu).

Author information

Corresponding author

Correspondence to Marco Danelutto.

About this article

Cite this article

Danelutto, M., De Matteis, T., De Sensi, D. et al. The RePhrase Extended Pattern Set for Data Intensive Parallel Computing. Int J Parallel Prog 47, 74–93 (2019). https://doi.org/10.1007/s10766-017-0540-z

  • DOI: https://doi.org/10.1007/s10766-017-0540-z
