The Polyhedral Model of Nonlinear Loops

Authors:
Aravind Sukumaran-Rajam

INRIA, Team CAMUS, ICube Lab, CNRS, University of Strasbourg, France

Philippe Clauss

INRIA, Team CAMUS, ICube Lab, CNRS, University of Strasbourg, France


ACM Transactions on Architecture and Code Optimization, Volume 12, Issue 4, Article No. 48, pp. 1–27. https://doi.org/10.1145/2838734

Published: 08 December 2015

Abstract

Runtime code optimization and speculative execution are becoming increasingly prominent means of improving performance in the current multi- and many-core era. However, wider and more efficient use of such techniques is hampered mainly by the prohibitive time overhead induced by centralized data race detection, dynamic code behavior modeling, and code generation. Most existing Thread-Level Speculation (TLS) systems rely on naively slicing the target loops into chunks and trying to execute the chunks in parallel with the help of a centralized, performance-penalizing verification module that takes care of data races. Lacking a data dependence model, these speculative systems cannot perform advanced transformations and, more importantly, the chances of rollback are high. The polyhedral model is a well-known mathematical model for analyzing and optimizing loop nests. Current state-of-the-art tools limit the application of the polyhedral model to static control codes. Thus, none of these tools can generally handle codes with while loops, indirect memory accesses, or pointers. Apollo (Automatic POLyhedral Loop Optimizer) is a framework that goes one step beyond and applies the polyhedral model dynamically by using TLS. Apollo can predict, at runtime, whether the codes are behaving linearly or not, and it applies polyhedral transformations on-the-fly. This article presents a novel system that enables Apollo to handle codes whose memory accesses and loop bounds are not necessarily linear. More generally, this approach expands the applicability of the polyhedral model at runtime to a wider class of codes. Combining linear and nonlinear accesses in the dependence prediction model enables the application of polyhedral loop-optimizing transformations even to nonlinear code kernels while also allowing low-cost verification of the speculation.
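To make the idea of predicting linear behavior at runtime more concrete, the following is a minimal Python sketch. It is not Apollo's implementation. It assumes a single loop index and a profiling window of `(iteration, address)` samples, and the names `fit_affine_1d` and `first_misprediction` are hypothetical. It tries to fit `address = base + stride * i` to the window and then checks later accesses against that prediction. In a speculative system, a mismatch is the kind of event that would trigger a rollback.

```python
def fit_affine_1d(samples):
    """Fit address = base + stride * i to profiled (i, address) samples.

    Returns (base, stride) if every sample matches an integer affine
    function of the loop index, otherwise None. Hypothetical helper.
    """
    if len(samples) < 2:
        return None
    (i0, a0), (i1, a1) = samples[0], samples[1]
    if i1 == i0:
        return None
    stride, remainder = divmod(a1 - a0, i1 - i0)
    if remainder != 0:
        return None  # no integer stride explains the first two samples
    base = a0 - stride * i0
    # The remaining samples must agree with the candidate function.
    for i, address in samples:
        if base + stride * i != address:
            return None
    return base, stride


def first_misprediction(model, samples):
    """Return the first iteration whose address deviates from the model,
    or None if all observed accesses match the prediction."""
    base, stride = model
    for i, address in samples:
        if base + stride * i != address:
            return i
    return None


if __name__ == "__main__":
    window = [(i, 1000 + 8 * i) for i in range(5)]
    model = fit_affine_1d(window)
    print("model:", model)
    later = [(5, 1040), (6, 1048), (7, 1060)]
    print("first misprediction at iteration:", first_misprediction(model, later))
```

The sketch only covers the linear case. The contribution described in the abstract is precisely about accesses and loop bounds that do not fit such a function exactly, which this simplified check would reject.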

Index Terms

1. Software and its engineering
  1. Software notations and tools
    1. General programming languages
      1. Language features

Reviews

Reviewer: William M. Waite

The polyhedral model is a technique for optimizing loop nests in a program. Each calculated value is modeled by a point, and the dependencies among the values from one iteration to the next are modeled by vectors connecting the points. The result is an object called a polytope that lies in an n-dimensional space, where n is the number of loops. Each point of the polytope has the values of the loop indexes as its coordinates in that space. Performance improvements can be made by applying certain kinds of geometrical transformations to the polytope and then converting the result back to code. Analysis is carried out at compile time, based on static properties of the program, and must therefore make conservative assumptions.

Previous papers proposed a framework to support dynamic analysis and speculative execution of the model. Here, Sukumaran-Rajam and Clauss show how to remove some of the constraints that the model places on the behavior of loop variables and the form of index expressions. After introducing the problem, the authors discuss the polyhedral model and its limitations. They then provide the architecture of Apollo, the system discussed in earlier papers, and explain how they extend it to remove its constraints. The paper concludes with a review of related work and some results obtained by applying the extended framework to programs selected from several standard benchmark suites.

The paper is well written but quite dense; I would not recommend it to a novice in the optimization area. A reader is expected to understand multicore optimization techniques in general and the polyhedral model in particular. There is a good set of references to help access this material, and the examples are well chosen. A person familiar with the area should have little difficulty understanding and evaluating the authors' contribution.

Online Computing Reviews Service
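The geometric picture in the review can be illustrated with a small Python sketch. It is not taken from the paper. It assumes a hypothetical two-deep loop nest computing `A[i][j] = A[i-1][j] + A[i][j-1]`, records only read-after-write dependences, and uses the made-up helper names `dependence_vectors` and `is_legal`. Iterations are points `(i, j)`, dependences are distance vectors between points, and a linear transformation of the iteration space is accepted only if every transformed vector stays lexicographically positive.

```python
def dependence_vectors(n):
    """Collect read-after-write distance vectors for the assumed loop nest

        for i in 1..n-1:
            for j in 1..n-1:
                A[i][j] = A[i-1][j] + A[i][j-1]

    by remembering which iteration last wrote each array cell.
    """
    last_writer = {}
    vectors = set()
    for i in range(1, n):
        for j in range(1, n):
            for cell in ((i - 1, j), (i, j - 1)):  # cells read here
                if cell in last_writer:
                    wi, wj = last_writer[cell]
                    vectors.add((i - wi, j - wj))
            last_writer[(i, j)] = (i, j)  # cell written here
    return vectors


def is_legal(transform, vectors):
    """A 2x2 integer transformation (tuple of rows) preserves the
    dependences if each transformed vector is lexicographically positive."""
    for d in vectors:
        t = tuple(sum(row[k] * d[k] for k in range(2)) for row in transform)
        if t <= (0, 0):  # Python compares tuples lexicographically
            return False
    return True


if __name__ == "__main__":
    deps = dependence_vectors(4)
    print("dependence vectors:", sorted(deps))
    print("interchange legal:", is_legal(((0, 1), (1, 0)), deps))
    print("inner reversal legal:", is_legal(((1, 0), (0, -1)), deps))
    print("skewing legal:", is_legal(((1, 0), (1, 1)), deps))
```

For this loop nest the sketch finds the vectors `(1, 0)` and `(0, 1)`, so interchanging or skewing the loops is accepted while reversing the inner loop is rejected. A static tool derives such vectors symbolically at compile time, whereas the sketch simply enumerates the iterations.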

Published in
ACM Transactions on Architecture and Code Optimization Volume 12, Issue 4
January 2016
848 pages
ISSN:1544-3566
EISSN:1544-3973
DOI:10.1145/2836331
Editor:
Koen De Bosschere
Ghent University
Copyright © 2015 ACM
© 2015 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 8 December 2015
- Revised: 1 October 2015
- Accepted: 1 October 2015
- Received: 1 May 2015

Author Tags
Speculative and dynamic loop parallelization
nonlinear memory references
polyhedral model
Qualifiers
- research-article
- Research
- Refereed