Skip to main content
Log in

CAS-DSM: A Compiler Assisted Software Distributed Shared Memory

  • Published:
International Journal of Parallel Programming Aims and scope Submit manuscript

Abstract

Traditional software Distributed Shared Memory (DSM) systems rely on the virtual memory management mechanisms to detect accesses to shared memory locations and maintain their consistency. The resulting involvement of the OS (kernel) and the associated overhead which is significant, can be avoided by careful compile time analysis and code instrumentation. In this paper, we propose such a Compiler Assisted Software support approach (CAS-DSM). In the CAS-DSM implementation, the involvement of the OS kernel is avoided by instrumenting the application code at the source level. The overhead caused by the execution of the instrumented code is reduced through several aggressive compile time optimizations. Finally, we also address the issue of reducing certain overheads in polling-based implementation of receiving asynchronous messages. We used SUIF, a public domain compiler tool, to implement compile time analysis, instrumentation and optimizations. We modified CVM, a publicly available software DSM to support the instrumentation inserted by the compiler. Detailed performance evaluation of CAS-DSM is reported using a set of Splash/Splash2 parallel application benchmarks on a distributed memory IBM SP-2 machine. CAS-DSM achieved moderate to good performance improvements for most of the applications compared to the original CVM implementation. Reducing the overheads in polling-based implementation improves the performance of CAS-DSM significantly resulting in an overall improvement of 12–52% over the original CVM implementation.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

REFERENCES

  1. J. L. Hennessy and D. A. Patterson, Computer Architecture: A Quantitative Approach, 2nd Ed., Morgan Kaufmann Publishers, San Francisco, CA (1996).

    Google Scholar 

  2. D. Lenoski, J. Laudon, K. Gharachorloo, W.-D. Weber, A. Gupta, J. Hennessy, M. Horowitz, and M. Lam, The Stanford DASH Multiprocessor, Computer, 25(3):63-79 (March 1992).

    Google Scholar 

  3. D. V. James, A. T. Laundrie, S. Gjessing, and G. S. Sohi, Distributed-Directory Scheme: Scalable Coherent Interface, IEEE Comput., 74-77 (June 1990).

  4. E. Hagersten, A. Landin, and S. Haridi, DDM—a Cache-Only Memory Architecture, Computer, 25(9):44-54 (September 1992).

    Google Scholar 

  5. S. Frank, KSR1: High Performance and Ease of Programming, No Longer an Oxymoron, Proc. of the 5th Ann. ACM Symp. on Parallel Algorithms and Architectures, Velen, Germany, p. 335(June 30–July 2, 1993).

    Google Scholar 

  6. Sarita V. Adve and Kourosh Gharachorloo, Shared Memory Consistency Models: A Tutorial, IEEE Comput., 66-76 (December 1996).

  7. V. S. Sunderam, PVM: A Framework for Parallel Distributed Computing, IEEE Concurrency: Practice and Experience, 2(4):315-339 (December 1990).

    Google Scholar 

  8. W. Gropp, E. Lusk, and A. Skjellum, Using MPI: Portable Parallel Programming with the Message-Passing Interface, MIT Press (1994).

  9. K. Li. Ivy, A Shared Virtual Memory System for Parallel Computing, International Conference on Parallel Processing, pp. 94-101 (1988).

  10. C. Amza, A. L. Cox, S. Dwarakdas, P. Keleher, H. Lu, R. Rajamony, W. Yu, and W. Zwaenepoel, Treadmarks: Shared Memory Computing on Network of Workstations, IEEE Computer, 18-28 (February 1996).

  11. J. K. Bennett, J. B. Carter, and W. Zwaenepoel, Munin: Distributed Shared Memory Based on Type-Specific Memory Coherence, Proc. of the Second ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming, Seattle, WA, pp. 168-176 (March 14–16, 1990).

  12. B. N. Bershad and M. J. Zekauskas, Midway: Shared Memory Parallel Programming with Entry Consistency for Distributed Memory Multiprocessors, Technical report, School of Computer Science, Carnegie Mellon University, Pisstburgh, PA 15213 (1991).

    Google Scholar 

  13. M. J. Zekauskas, W. A. Sawdon, and B. N. Bershad, Software Write Detection for a Distributed Shared Memory, The Symposium on Operating Systems Design and Implementation (OSDI), pp. 87-100 (November 1994).

  14. P. Keleher, CVM: The Coherent Virtual Machine, University of Maryland, College Park, MD (July 1997). http://www.cs.umd.edu/projects/cvm

    Google Scholar 

  15. D. J. Scales, K. Gharachorloo, and C. A. Thekkath, Shasta: A low Overhead, Software-Only Approach for Supporting Fine-Grain Shared Memory, Proceedings of Seventh International Conference on Architectural Support for Programming Languages and Operating Systems, Cambridge, MA, pp. 174-185 (October 1996).

  16. P. Keleher, Tapeworm: High Level Abstractions of Shared Access, Proc. of the 3rd Symp. on Operating System Design and Implementation, pp. 201-214 (February 1999).

  17. A. Itzkovitz and A. Schuster, Multiview and Millipage: Fine Grain Sharing in Page Based DSMs, Proc. of the 3rd Symp. on Operating System Design and Implementation, pp. 215-228 (February 1999).

  18. H. Lu, S. Dwarkadas, A. L. Cox, and W. Zwaenepoel, Message Passing Versus Distributed Shared Memory on Networks of Workstations, Proc. of the Supercomputing Conference SC-95 (December 1995).

  19. OpenMP. http://www.openmp.org.

  20. H. Lu, C. Lu, and W. Zwaenepoel, OpenMP on networks of workstations, Proc. of the Supercomputing Conference SC-98 (November 1998).

  21. K. Kusano, M. Sato, T. Hosomi, and Y. Seo, The Omni OpenMP Compiler on the Distributed Shared Memory of Cenju-4, Proc. of the Intl. Workshop on OpenMP Applications and Tools, WOMPAT 2001, West Lafayette, IN, USA (July 2001).

  22. S. C. Woo, M. Ohara, E. Torrie, J. Pal Shingh, and A. Gupta, The SPLASH-2 Programs: Characterization and Methodological Considerations, Proc. of the 22nd Ann. Intl. Symp. on Computer Architecture, Santa Margherita Ligure, Italy, pp. 24-36 (June 22–24, 1995).

  23. K. Gharachorloo, D. Lenoski, J. Laudon, P. Gibbons, A. Gupta, and J. Hennessy, Memory Consistency and Event Ordering in Scalable Shared-Memory Multiprocessors, Proc. of the 17th Ann. Intl. Symp. on Computer Architecture, Seattle, WA, pp. 15-26 (May 28–31, 1990).

  24. J. R. Goodman, Cache Consistency and Sequential Consistency, Technical Report 1006, Dep. of Comp. Sci., University of Wisconsin, Madison (February 1991).

    Google Scholar 

  25. M. Dubois, C. Scheurich, and F. Briggs, Memory Access Buffering in Multiprocessors, Proc. of the 13th Ann. Intl. Symp. on Computer Architecture, Tokyo, Japan, pp. 434-442 (June 2–5, 1986).

  26. P. Keleher, A. L. Cox, and W. Zwaenepoel, Lazy Release Consistency for Software Distributed Shared Memory, Proc. of the 19th Ann. Intl. Symp. on Computer Architecture, Gold Coast, Australia, pp. 13-21 (May 19–21, 1992).

  27. K. L. Johnson, M. F. Kaashoek, and D. Wallach, CRL: High-Performance All-Software Distributed Shared Memory, Proc. of the Fifth Workshop on Scalable Shared Memory Multiprocessors (June 1995).

  28. K. Thitikamol and P. Keleher, Multi-Threading and Remote Latency in Software dsms, International Conference on Distributed Computer Systems, Baltimore, Maryland, USA (May 1997).

  29. D. A. Bader and J. JaJa, SIMPLE: A Methodology for Programming High Performance Algorithms on Clusters of Symmetric Multiprocessors (SMPs), Journal of Parallel and Distributed Computing, 58(1):92-108 (1999).

    Google Scholar 

  30. H. Han and C-W. Tseng, Compile-Time Synchronization Optimizations for Software DSMs, Proc. of the International Parallel Processing Symposium (1998).

  31. T. Park and H. Y. Yeom, An Efficient Logging and Recovery Scheme for Lazy Release Consistent Distributed Shared Memory Systems, Future Generation Computer Systems, 17(3):265-278 (2000).

    Google Scholar 

  32. Stanford SUIF Compiler Group, The SUIF Parallelizing Compiler Group, Technical report, Stanford University (1994).

  33. T. Agerwala, J. L. Martin, J. H. Mirza, D. C. Sadler, D. M. Dias, and M. Snir, Sp2 System Architecture, IBM Systems Journal, 34(2):152-184 (1995).

    Google Scholar 

  34. C. B. Stunkel, D. G. Shea, B. Abali, M. G. Atkins, C. A. Bender, D. G. Grice, P. Hochschild, D. J. Joseph, B. J. Nathanson, R. A. Swetz, R. F. Stucke, M. Tsao, and P. R. Varker, The sp2 High-Performance Switch, IBM Systems Journal, 34(2):185-204 (1995).

    Google Scholar 

  35. Stanford SUIF Compiler Group, An Overview of the Suif Compiler System, Technical report, Stanford University (1994).

  36. H. Lu, A. L. Cox, S. Dwarkadas, R. Rajamony, and W. Zwaenepoel, Compiler and Software Distributed Shared Memory Support for Irregular Applications, Sixth Symposium on Principles and Practices of Parallel Programming, pp. 48-56 (1997).

  37. G. E. Blelloch, C. E. Lieserson, B. M. Maggs, C. G. Plaxton, and S. J. Smith, and M. Zagha, A Comparison of Sorting Algorithms for the Connection Machine CM-2, Symposium on Parallel Algorithms and Architectures, pp. 3-16 (July 1991).

  38. S. C. Woo, J. P. Singh, and J. L. Hennessy, The Performance Advantages of Integrating Message Passing in Cache-Coherent Multiprocessors, Technical report, Stanford University (December 1993).

  39. D. Perkovic and P. Keleher, Responsiveness without Interrupts, Proc. of the 1999 Intl. Conf. on Supercomputing (ICS-99), pp. 101-108 (June 1999).

  40. I. Schoinas, B. Falsafi, A. R. Lebeck, S. K. Reinhardt, J. R. Larus, and D. A. Wood, Fine-Grain Access Control for Distributed Shared Memory, Proceedings of Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, CA, pp. 297-306 (October 1996).

  41. D. J. Scales and M. S. Lam, An Efficient Memory Layer for Distributed Memory Machines, CSL-TR-94-627, Computer Systems Laboratory, Stanford University (1994).

  42. S. Ahuja, N. Carriero, and D. Gelernter, Linda and Friends, Computer, 19(8):26-34 (August 1986).

    Google Scholar 

  43. D. Yeung, J. Kubiatowitcz, and A. Agarwal, MGS: A Multigrain Shared Memory System, Proceedings of the Twenty-Third International Symposium on Computer Architecture, pp. 44-55 (May 1996).

  44. J. Protić, M. Tomašević, and V. Milutinović, Distributed Shared Memory: Concepts and Systems, IEEE Parallel and Distributed Technology, 63-79 (1996).

  45. S. Dwarkdas, A. L. Cox, and W. Zwaenepoel, An Integrated Compile-Time/Run-Time Software Distributed Shared Memory System, Proc. of Sixth Intl. Conf. on Architectural Support for Programming Languages and Operating Systems, pp. 186-197, San Jose, CA (October 1996).

  46. T. Mowry and A. Gupta, Tolerating Latency Through Software-Controlled Prefetching in Shared-Memory Multiprocessors, Journal of Parallel and Distributed Computing 12:87-106 (1991).

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Rights and permissions

Reprints and permissions

About this article

Cite this article

Manoj, N.P., Manjunath, K.V. & Govindarajan, R. CAS-DSM: A Compiler Assisted Software Distributed Shared Memory. International Journal of Parallel Programming 32, 77–122 (2004). https://doi.org/10.1023/B:IJPP.0000023480.82632.87

Download citation

  • Issue Date:

  • DOI: https://doi.org/10.1023/B:IJPP.0000023480.82632.87

Navigation