Loop Shifting for Loop Compaction

International Journal of Parallel Programming

Abstract

Decomposed software pipelining decouples the software pipelining problem into a cyclic scheduling problem without resource constraints and an acyclic scheduling problem with resource constraints. In terms of loop transformation and code motion, the technique can be formulated as a combination of loop shifting and loop compaction. Loop shifting moves statements between iterations, thereby changing some loop-independent dependences into loop-carried dependences and vice versa. Loop compaction then schedules the body of the loop, considering only loop-independent dependences but taking into account the details of the target architecture. In this paper, we show how loop shifting can be optimized so as to minimize both the length of the critical path and the number of dependences for loop compaction. The first problem is well known and can be solved by an algorithm due to Leiserson and Saxe. We show that the second optimization (and its combination with the first one) is also polynomially solvable with a fast graph algorithm, a variant of minimum-cost flow algorithms. Finally, we analyze the improvements obtained on loop compaction by experiments on random graphs.
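
The effect of loop shifting on the two quantities named above can be illustrated with a minimal sketch. It is not the optimization algorithm of the paper: it only applies a given shift and measures the result. The representation is an assumption made for illustration: a dependence is a triple `(u, v, w)` where `w` is the dependence distance in iterations (`w == 0` means loop-independent), `r[v]` is the number of iterations by which statement `v` is delayed, and the shifted distance is taken to be `w + r[v] - r[u]`, following the usual retiming convention. The function names and the unit delays are hypothetical.

```python
from collections import defaultdict


def shift(edges, r):
    """Apply a shift r (statement -> iterations delayed) to all dependence distances."""
    shifted = [(u, v, w + r.get(v, 0) - r.get(u, 0)) for u, v, w in edges]
    if any(w < 0 for _, _, w in shifted):
        raise ValueError("illegal shift: a dependence distance became negative")
    return shifted


def loop_independent(edges):
    """Dependences with distance 0, the only ones seen by loop compaction."""
    return [(u, v) for u, v, w in edges if w == 0]


def critical_path(edges, delay):
    """Longest path (sum of statement delays) through loop-independent dependences.

    Assumes every dependence cycle has positive total distance, so the
    distance-0 subgraph is acyclic.
    """
    succ = defaultdict(list)
    for u, v in loop_independent(edges):
        succ[u].append(v)
    memo = {}

    def longest(u):
        if u not in memo:
            memo[u] = delay[u] + max((longest(v) for v in succ[u]), default=0)
        return memo[u]

    return max(longest(u) for u in delay)


# Hypothetical loop body: a -> b -> c within one iteration,
# and c -> a carried over two iterations.
edges = [("a", "b", 0), ("b", "c", 0), ("c", "a", 2)]
delay = {"a": 1, "b": 1, "c": 1}

before = critical_path(edges, delay)
after_edges = shift(edges, {"c": 1})  # delay statement c by one iteration
after = critical_path(after_edges, delay)
print(before, after, len(loop_independent(after_edges)))
```

In this example, delaying `c` by one iteration turns the dependence `b -> c` into a loop-carried one, so the loop body handed to compaction has a shorter critical path and fewer loop-independent dependences. Choosing the shift that minimizes these quantities is the problem the paper addresses.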

REFERENCES

  1. John L. Hennessy and David A. Patterson, Computer Architecture: A Quantitative Approach, 2nd ed., Chap. 4, Morgan Kaufmann (1996).

  2. Carole Dulong, The IA-64 architecture at work, Computer, 31(7):24–32 (July 1998).

  3. Vicki H. Allan, Reese B. Jones, Randall M. Lee, and Stephen J. Allan, Software pipelining, ACM Computing Surveys, 27(3):367–432 (September 1995).

  4. Monica S. Lam, Software pipelining: An effective scheduling technique for VLIW machines, SIGPLAN'88 Conf. Progr. Lang. Design and Implementation, ACM Press, Atlanta, Georgia, pp. 318–328 (1988).

  5. B. R. Rau and C. D. Glaeser, Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing, Proc. 14th Ann. Workshop on Microprogramming, pp. 183–198 (October 1981).

  6. B. R. Rau, Iterative modulo scheduling, IJPP, 24(1):3–64 (1996).

  7. R. A. Huff, Lifetime-sensitive modulo scheduling, Conf. Progr. Lang. Design and Implementation (PLDI'93), ACM, pp. 258–267 (1993).

  8. J. Llosa, A. González, E. Ayguadé, and M. Valero, Swing modulo scheduling: A lifetime-sensitive approach, Conf. Parallel Architectures and Compilation Techniques (PACT'96), IEEE Computer Society Press, Boston, Massachusetts (1996).

  9. Alexander Aiken and Alexandru Nicolau, Perfect pipelining: A new loop optimization technique, European Symp. Programming, Vol. 300, Lecture Notes in Computer Science, Springer-Verlag, pp. 221–235 (1988).

  10. M. Rajagopalan and V. H. Allan, Specification of software pipelining using Petri nets, IJPP, 22(3):273–301 (1994).

  11. Suneel Jain, Circular scheduling, Conf. Progr. Lang. Design and Implementation (PLDI'91), ACM, pp. 219–228 (1991).

  12. Soo-Mook Moon and Kemal Ebcioğlu, An efficient resource-constrained global scheduling technique for superscalar and VLIW processors, 25th Ann. Int'l. Symp. Microarchitecture, pp. 55–71 (1992).

  13. L.-F. Chao, A. LaPaugh, and E. Sha, Rotation scheduling: A loop pipelining algorithm, 30th ACM-IEEE Design Automation Conf., pp. 566–572 (1993).

  14. F. Gasperoni and U. Schwiegelshohn, Generating close to optimum loop schedules on parallel processors, Parallel Proc. Lett., 4(4):391–403 (1994).

  15. J. Wang, C. Eisenbeis, M. Jourdan, and B. Su, Decomposed software pipelining, IJPP, 22(3):351–373 (1994).

  16. P.-Y. Calland, A. Darte, and Y. Robert, Circuit retiming applied to decomposed software pipelining, IEEE Trans. Parallel Distrib. Syst., 9(1):24–35 (January 1998).

  17. U. Schwiegelshohn, F. Gasperoni, and K. Ebcioğlu, On optimal parallelization of arbitrary loops, Journal of Parallel and Distributed Computing, 11:130–134 (1991).

  18. M. Gondran and M. Minoux, Graphs and Algorithms, John Wiley (1984).

  19. C. Hanen and A. Munier, Cyclic scheduling on parallel processors: An overview. In P. Chrétienne, E. G. Coffman, Jr., J. K. Lenstra, and Z. Liu (eds.), Scheduling Theory and Its Applications, John Wiley (1995).

  20. E. G. Coffman, Jr., Computer and Job-Shop Scheduling Theory, John Wiley (1976).

  21. Myricom, Inc. LANai 3.0 instruction set. Electronic document http://www.myricom.com/scs/L3/doc/inst_toc.html.

  22. Ping Hu, Ordonnancement modulo par recouvrement, 10ème Rencontres du Parallélisme (RenPar'10), Strasbourg, France (June 1998).

  23. C. E. Leiserson and J. B. Saxe, Retiming synchronous circuitry, Algorithmica, 6(1):5–35 (1991).

  24. Tsing-Fa Lee, Allen C.-H. Wu, Wei-Jeng Chen, Wei-Kai Cheng, and Youn-Long Lin, On the relationship between sequential logic retiming and loop folding, Proc. SASIMI'93, Nara, Japan, pp. 384–393 (October 1993).

  25. Alain Darte, Georges-André Silber, and Frédéric Vivien, Combining retiming and scheduling techniques for loop parallelization and loop tiling, Parallel Proc. Lett., 7(4):379–392 (1997).

  26. F. Gasperoni and U. Schwiegelshohn, Transforming cyclic scheduling problems into acyclic ones. In P. Chrétienne, E. G. Coffman, Jr., J. K. Lenstra, and Z. Liu (eds.), Scheduling Theory and Its Applications, John Wiley, pp. 241–258 (1995).

  27. Trimaran, An infrastructure for research in instruction level parallelism. Electronic document http://www.trimaran.org.

  28. Salto, Salto: System of assembly language transformation and optimization. Electronic document http://www.irisa.fr/caps/projects/Salto/.

  29. A. Eichenberger, E. S. Davidson, and S. G. Abraham, Minimum register requirements for a modulo schedule, Proc. 27th Int'l. Symp. Microarchitecture, San Jose, California, pp. 75–84 (1994).

  30. Antoine Sawaya, Pipeline logiciel: Découplage et contraintes de registres, Ph.D. thesis, Université de Versailles, France (1997).

Cite this article

Darte, A., Huard, G. Loop Shifting for Loop Compaction. International Journal of Parallel Programming 28, 499–534 (2000). https://doi.org/10.1023/A:1007506711786
