Abstract
A memory consistency model (MCM) is the part of a programming language or computer architecture specification that defines which values can legally be read from shared memory locations. Because MCMs take into account various optimisations employed by architectures and compilers, they are often complex and counterintuitive, which makes them challenging to design and to understand.
We identify four tasks involved in designing and understanding MCMs: generating conformance tests, distinguishing two MCMs, checking compiler optimisations, and checking compiler mappings. We show that all four tasks are instances of a general constraint-satisfaction problem to which the solution is either a program or a pair of programs. Although this problem is intractable for automatic solvers when phrased over programs directly, we show how to solve analogous constraints over program executions, and then construct programs that satisfy the original constraints.
Our technique, which is implemented in the Alloy modelling framework, is illustrated on several software- and architecture-level MCMs, both axiomatically and operationally defined. We automatically recreate several known results, often in a simpler form, including: distinctions between variants of the C11 MCM; a failure of the "SC-DRF guarantee" in an early C11 draft; that x86 is "multi-copy atomic" and Power is not; bugs in common C11 compiler optimisations; and bugs in a compiler mapping from OpenCL to AMD-style GPUs. We also use our technique to develop and validate a new MCM for NVIDIA GPUs that supports a natural mapping from OpenCL.
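To make the idea of solving constraints over executions rather than programs concrete, here is a minimal brute-force sketch in Python. It is not the paper's Alloy implementation. The event names, the fixed store-buffering test, and the simplified SC and x86-TSO axioms (no fences, no read-modify-writes) are all assumptions chosen for illustration. It enumerates every candidate execution of one small program and reports those that the weaker model allows but the stronger model forbids.

```python
from itertools import product

# Hypothetical event table for the store-buffering (SB) test:
#   thread 0: a: x = 1; b: r0 = y      thread 1: c: y = 1; d: r1 = x
# "ix"/"iy" are initial writes of 0; thread None means "no thread".
EVENTS = {
    "ix": (None, "W", "x"), "iy": (None, "W", "y"),
    "a": (0, "W", "x"), "b": (0, "R", "y"),
    "c": (1, "W", "y"), "d": (1, "R", "x"),
}
PO = {("a", "b"), ("c", "d")}    # program order
CO = {("ix", "a"), ("iy", "c")}  # coherence order: initial write first


def acyclic(edges):
    """Return True if the directed graph given by `edges` has no cycle."""
    edges = set(edges)
    nodes = {n for e in edges for n in e}
    while nodes:
        # Nodes with no incoming edge can be peeled off safely.
        sources = {n for n in nodes if not any(t == n for _, t in edges)}
        if not sources:
            return False  # every remaining node lies on or behind a cycle
        nodes -= sources
        edges = {(s, t) for s, t in edges if s not in sources}
    return True


def candidate_executions():
    """Yield every reads-from choice: each read picks a write to its location."""
    reads = [e for e, (_, kind, _) in EVENTS.items() if kind == "R"]
    options = [
        [w for w, (_, kind, loc) in EVENTS.items()
         if kind == "W" and loc == EVENTS[r][2]]
        for r in reads
    ]
    for choice in product(*options):
        yield dict(zip(reads, choice))


def relations(rf_map):
    """Derive reads-from (rf) and from-reads (fr) edges from a read->write map."""
    rf = {(w, r) for r, w in rf_map.items()}
    fr = {(r, w2) for r, w in rf_map.items() for (w1, w2) in CO if w1 == w}
    return rf, fr


def sc(rf_map):
    """Simplified SC axiom: po, rf, co and fr together must be acyclic."""
    rf, fr = relations(rf_map)
    return acyclic(PO | rf | CO | fr)


def tso(rf_map):
    """Simplified x86-TSO axioms: per-location coherence, plus acyclicity
    after dropping write-to-read program order and thread-internal rf."""
    rf, fr = relations(rf_map)
    po_loc = {(s, t) for s, t in PO if EVENTS[s][2] == EVENTS[t][2]}
    ppo = {(s, t) for s, t in PO
           if not (EVENTS[s][1] == "W" and EVENTS[t][1] == "R")}
    rfe = {(w, r) for w, r in rf if EVENTS[w][0] != EVENTS[r][0]}
    return acyclic(po_loc | rf | CO | fr) and acyclic(ppo | rfe | CO | fr)


def distinguishing(weak, strong):
    """Executions allowed by `weak` but forbidden by `strong`."""
    return [x for x in candidate_executions() if weak(x) and not strong(x)]


if __name__ == "__main__":
    for rf_map in distinguishing(tso, sc):
        print(rf_map)
```

In this sketch the only distinguishing execution is the one in which both reads observe the initial writes, which is the familiar store-buffering behaviour. A solver-based approach searches over executions of bounded size instead of fixing the program in advance, but the per-execution consistency check has the same shape.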
Supplemental Material
This repository contains materials for recreating and building upon our results. We provide the source code of our Alloy models, instructions for reproducing our performance results (Table 2), and the data underlying the experimental testing of our new PTX MCM.
Automatically comparing memory consistency models
POPL '17: Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages