The RePhrase Extended Pattern Set for Data Intensive Parallel Computing

Danelutto, Marco; De Matteis, Tiziano; De Sensi, Daniele; Mencagli, Gabriele; Torquati, Massimo; Aldinucci, Marco; Kilpatrick, Peter

doi:10.1007/s10766-017-0540-z

Published: 28 November 2017

Volume 47, pages 74–93, (2019)

International Journal of Parallel Programming

Marco Danelutto ORCID: orcid.org/0000-0002-7433-376X¹,
Tiziano De Matteis¹,
Daniele De Sensi¹,
Gabriele Mencagli¹,
Massimo Torquati¹,
Marco Aldinucci² &
Peter Kilpatrick³

Abstract

We discuss the extended parallel pattern set identified within the EU-funded project RePhrase as a candidate pattern set to support data intensive applications targeting heterogeneous architectures. The set has been designed to include three classes of pattern, namely (1) core patterns, modelling common, not necessarily data intensive parallelism exploitation patterns, usually to be used in composition; (2) high level patterns, modelling common, complex and complete parallelism exploitation patterns; and (3) building block patterns, modelling the single components of data intensive applications, suitable for use in composition to implement patterns not covered by the core and high level patterns. We discuss the expressive power of the RePhrase extended pattern set and present results illustrating the performance that may be achieved with the FastFlow implementation of the high level patterns.
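The idea of core patterns "to be used in composition" can be illustrated with a minimal sketch in standard C++. It does not use the FastFlow API and is not the RePhrase implementation: the names `map_pattern`, `reduce_pattern` and `sum_of_squares` are hypothetical, chosen only for illustration, and the reduce step is kept sequential for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical "map" pattern: applies f to every element of the input,
// splitting the index range into contiguous chunks, one async task per chunk.
// Assumes f may be called concurrently and Out is not bool
// (std::vector<bool> elements cannot be written safely in parallel).
template <typename Out, typename In, typename F>
std::vector<Out> map_pattern(const std::vector<In>& input, F f,
                             std::size_t workers = 4) {
  assert(workers > 0);
  std::vector<Out> output(input.size());
  std::vector<std::future<void>> tasks;
  const std::size_t chunk = (input.size() + workers - 1) / workers;
  for (std::size_t w = 0; w < workers; ++w) {
    const std::size_t begin = w * chunk;
    const std::size_t end = std::min(input.size(), begin + chunk);
    if (begin >= end) break;  // no work left for the remaining workers
    tasks.push_back(std::async(std::launch::async, [&, begin, end] {
      for (std::size_t i = begin; i < end; ++i) output[i] = f(input[i]);
    }));
  }
  for (auto& t : tasks) t.get();  // wait for all chunks to complete
  return output;
}

// Hypothetical "reduce" pattern: folds the input with an associative
// operator. Sequential here; a parallel version would reduce chunks
// independently and then combine the partial results.
template <typename T, typename Op>
T reduce_pattern(const std::vector<T>& input, T identity, Op op) {
  return std::accumulate(input.begin(), input.end(), identity, op);
}

// Composition of the two patterns: map (square) followed by reduce (sum).
long long sum_of_squares(const std::vector<int>& values) {
  auto squares = map_pattern<long long>(
      values, [](int x) { return static_cast<long long>(x) * x; });
  return reduce_pattern(squares, 0LL, std::plus<long long>());
}
```

Here `sum_of_squares` is itself a small map-then-reduce composition: the business logic (squaring, addition) is passed as parameters, while the parallel structure lives entirely inside the pattern functions.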

Notes

http://hadoop.apache.org/.
http://spark.apache.org/.
http://storm.apache.org/.
http://flink.apache.org/.
http://www-03.ibm.com/software/products/en/ibm-streams.
In the following we use Greek letters to denote data types. The expression \(x : \alpha\) denotes an object x whose type is \(\alpha\), while the expression \(f : \alpha \rightarrow \beta\) denotes a function f computing a result of type \(\beta\) from an input of type \(\alpha\).
Where () is the “no parameter” (void) type.
http://parsec.cs.princeton.edu/.
http://calvados.di.unipi.it/fastflow.

Acknowledgements

This work has been partially funded by the EU H2020-ICT-2014-1 Project No. 644235 RePhrase “Refactoring Parallel Heterogeneous Resource-Aware Applications” (http://www.reprhase-ict.eu).

Author information

Authors and Affiliations

University of Pisa, Pisa, Italy
Marco Danelutto, Tiziano De Matteis, Daniele De Sensi, Gabriele Mencagli & Massimo Torquati
University of Torino, Turin, Italy
Marco Aldinucci
Queen’s University Belfast, Belfast, UK
Peter Kilpatrick

Corresponding author

Correspondence to Marco Danelutto.

About this article

Cite this article

Danelutto, M., De Matteis, T., De Sensi, D. et al. The RePhrase Extended Pattern Set for Data Intensive Parallel Computing. Int J Parallel Prog 47, 74–93 (2019). https://doi.org/10.1007/s10766-017-0540-z

Received: 26 May 2017
Accepted: 18 November 2017
Published: 28 November 2017
Issue Date: 15 February 2019
DOI: https://doi.org/10.1007/s10766-017-0540-z
