A thread partitioning approach for speculative multithreading

Liu, Bin; Zhao, Yinliang; Li, Yuxiang; Sun, Yanjun; Feng, Boqin

doi:10.1007/s11227-013-1000-1

Published: 09 August 2013

The Journal of Supercomputing, Volume 67, pages 778–805 (2014)

Abstract

Speculative multithreading (SpMT) is a thread-level automatic parallelization technique that partitions sequential programs into multiple threads to be executed in parallel. This paper presents different thread partitioning strategies for nonloops and loops. For nonloops, we propose a cost estimation, based on the combined run-time effects of various speculation factors, that predicts the performance of candidate threads and guides thread partitioning. For loops, we parallelize all profitable loops that can potentially offer additional performance benefits through multilevel spawning in loop bodies, loop iterations, and inner loops. We then select a thread boundary located just before the loop branch instruction to reduce the number of invalid spawned threads that waste core resources. Experimental results show that the proposed approach significantly increases speedup; on the Olden benchmarks, performance improves by 6.62 % on average.
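To make the idea of cost-guided partitioning concrete, the following is a minimal sketch of how an estimated cost could be used to accept or reject a candidate thread. It is not the cost model from the paper, whose speculation factors are not listed in the abstract. The `Candidate` fields, the two-thread overlap assumption, the squash-and-re-execute penalty, and the `threshold` parameter are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A hypothetical candidate thread; all fields are illustrative assumptions."""
    name: str
    pre_spawn: float    # cycles executed sequentially before the spawn point
    parent_rest: float  # cycles the parent still runs after spawning
    child: float        # cycles of the speculative child thread
    p_fail: float       # estimated probability that the child is squashed
    overhead: float     # spawn plus commit overhead, in cycles


def sequential_cost(c: Candidate) -> float:
    """Cycles needed when the region runs without speculation."""
    return c.pre_spawn + c.parent_rest + c.child


def speculative_cost(c: Candidate) -> float:
    """Expected cycles with speculation, under a simple two-outcome assumption."""
    # Success: parent and child overlap, so only the longer one counts.
    ok = c.pre_spawn + c.overhead + max(c.parent_rest, c.child)
    # Failure: the child is squashed and re-executed after the parent finishes.
    bad = c.pre_spawn + c.overhead + c.parent_rest + c.child
    return (1.0 - c.p_fail) * ok + c.p_fail * bad


def estimated_speedup(c: Candidate) -> float:
    """Ratio of sequential cost to expected speculative cost."""
    return sequential_cost(c) / speculative_cost(c)


def select_profitable(cands: list[Candidate], threshold: float = 1.0) -> list[Candidate]:
    """Keep only candidates whose estimated speedup exceeds the threshold."""
    return [c for c in cands if estimated_speedup(c) > threshold]


if __name__ == "__main__":
    cands = [
        Candidate("safe", pre_spawn=10, parent_rest=50, child=50, p_fail=0.0, overhead=10),
        Candidate("risky", pre_spawn=10, parent_rest=50, child=50, p_fail=1.0, overhead=10),
    ]
    for c in select_profitable(cands):
        print(c.name, round(estimated_speedup(c), 3))
```

In this sketch a candidate that always misspeculates is rejected because the spawn overhead makes it slower than sequential execution, while a candidate that never misspeculates is accepted. A real estimator would derive the probabilities and cycle counts from profiling data rather than fixed constants.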

Acknowledgements

We thank our colleagues for their collaboration and for providing direction for the present work. We also thank the reviewers for their specific comments and suggestions. This work is supported by the National Natural Science Foundation of China under Grant No. 61173040 and the National High Technology Research and Development Program of China under Grant No. 2012AA011003.

Author information

Authors and Affiliations

Department of Computer Science, Xi’an Jiaotong University, Xi’an, 710049, P.R. China
Bin Liu, Yinliang Zhao, Yuxiang Li, Yanjun Sun & Boqin Feng

Corresponding author

Correspondence to Bin Liu.

About this article

Cite this article

Liu, B., Zhao, Y., Li, Y. et al. A thread partitioning approach for speculative multithreading. J Supercomput 67, 778–805 (2014). https://doi.org/10.1007/s11227-013-1000-1

Issue Date: March 2014
DOI: https://doi.org/10.1007/s11227-013-1000-1
