Abstract
We discuss the extended parallel pattern set identified within the EU-funded project RePhrase as a candidate pattern set to support data-intensive applications targeting heterogeneous architectures. The set has been designed to include three classes of patterns: (1) core patterns, which model common, not necessarily data-intensive parallelism exploitation patterns and are usually used in composition; (2) high-level patterns, which model common, complex, and complete parallelism exploitation patterns; and (3) building-block patterns, which model the individual components of data-intensive applications and can be composed to implement patterns not covered by the core and high-level patterns. We discuss the expressive power of the RePhrase extended pattern set and present results illustrating the performance that can be achieved with the FastFlow implementation of the high-level patterns.
Notes
In the following we use Greek letters to denote data types. The expression \(x : \alpha\) denotes an object x whose type is \(\alpha\), while the expression \(f : \alpha \rightarrow \beta\) denotes a function f that computes a result of type \(\beta\) from an input of type \(\alpha\).
Here () denotes the “no parameter” (void) type.
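As an informal illustration of this notation (not taken from the article), the sketch below maps \(x : \alpha\) and \(f : \alpha \rightarrow \beta\) onto Python type hints. The names `apply`, `length`, and `make_seed` are hypothetical and chosen only for this example; the type variables `Alpha` and `Beta` stand in for the Greek letters.

```python
from typing import Callable, TypeVar

# Type variables playing the role of the Greek letters alpha and beta.
Alpha = TypeVar("Alpha")
Beta = TypeVar("Beta")


def apply(f: Callable[[Alpha], Beta], x: Alpha) -> Beta:
    """Given f : alpha -> beta and x : alpha, return f(x) : beta."""
    return f(x)


def length(s: str) -> int:
    """A concrete instance of f : alpha -> beta with alpha = str, beta = int."""
    return len(s)


def make_seed() -> int:
    """A function of type () -> int: it takes no parameter at all."""
    return 42


n = apply(length, "RePhrase")  # x : str, result : int
seed = make_seed()             # () -> int
```

Here `Callable[[Alpha], Beta]` plays the role of \(\alpha \rightarrow \beta\), and an empty parameter list corresponds to the () type.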
Acknowledgements
This work has been partially funded by the EU H2020-ICT-2014-1 Project No. 644235 RePhrase “Refactoring Parallel Heterogeneous Resource-Aware Applications” (http://www.reprhase-ict.eu).
Cite this article
Danelutto, M., De Matteis, T., De Sensi, D. et al. The RePhrase Extended Pattern Set for Data Intensive Parallel Computing. Int J Parallel Prog 47, 74–93 (2019). https://doi.org/10.1007/s10766-017-0540-z
DOI: https://doi.org/10.1007/s10766-017-0540-z