Abstract
The widely studied I/O and ideal-cache models were developed to account for the large difference in the cost of accessing memory at different levels of the memory hierarchy. Both models are based on a two-level memory hierarchy with a fixed-size fast memory (cache) of size M and an unbounded slow memory organized in blocks of size B. The cost measure is based purely on the number of block transfers between the primary (fast) and secondary (slow) memory; all other operations are free. Many algorithms have been analyzed in these models, and the models predict the relative performance of algorithms much more accurately than the standard Random Access Machine (RAM) model. They do, however, require algorithms to be specified at a very low level: the user must carefully lay out data in arrays in memory and manage memory allocation explicitly.
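To make the cost measure concrete, the following is a minimal sketch of a block-transfer counter for a two-level hierarchy. It is an illustration only, not part of the paper: the function name `count_block_transfers`, the fully associative cache, and the use of LRU replacement (as a stand-in for the optimal replacement assumed by the ideal-cache model) are all assumptions made here.

```python
from collections import OrderedDict


def count_block_transfers(addresses, M, B):
    """Count block loads for a sequence of word addresses.

    Assumptions (not from the source): a fully associative cache that
    holds M // B blocks, LRU replacement in place of the ideal-cache
    model's optimal replacement, and only loads are counted.
    """
    capacity = M // B
    cache = OrderedDict()  # keys are block numbers, ordered by recency
    transfers = 0
    for addr in addresses:
        block = addr // B
        if block in cache:
            cache.move_to_end(block)  # hit: mark as most recently used
        else:
            transfers += 1  # miss: one block transfer from slow memory
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return transfers


n, M, B = 1024, 64, 8

# Sequential scan: each block is loaded exactly once, so n / B transfers.
sequential = count_block_transfers(range(n), M, B)

# Strided scan: visits every address once, but with stride B, so
# consecutive accesses always fall in different blocks.
strided_addresses = [(i * B) % n + (i * B) // n for i in range(n)]
strided = count_block_transfers(strided_addresses, M, B)

print(sequential, strided)
```

Both scans touch the same n words and cost the same in the RAM model, yet under these assumptions the strided scan incurs one transfer per access while the sequential scan incurs one per block. This gap is what the block-transfer cost measure is designed to expose.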
We present a cost model for analyzing the memory efficiency of algorithms expressed in a simple functional language. We show how some algorithms written in standard forms using just lists and trees (no arrays), and requiring no explicit memory layout or memory management, are efficient in the model. We then describe an implementation of the language and show provable bounds for mapping the cost in our model to the cost in the ideal-cache model. These bounds imply that purely functional programs based on lists and trees, with no special attention to any details of memory layout, can be asymptotically as efficient as carefully designed imperative I/O-efficient algorithms. For example, we describe an $O\left(\frac{n}{B}\log_{M/B}\frac{n}{B}\right)$ cost sorting algorithm, which is optimal in the ideal-cache and I/O models.
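As a rough illustration of the sorting bound, the sketch below evaluates the expression (n/B) log_{M/B}(n/B) for concrete parameters. The function name `sort_io_bound` and the sample values of n, M, and B are hypothetical, and the constant factors hidden by the asymptotic notation are ignored, so the result indicates an order of magnitude rather than an exact transfer count.

```python
import math


def sort_io_bound(n, M, B):
    """Evaluate (n/B) * log_{M/B}(n/B), ignoring constant factors.

    n: number of elements, M: cache size, B: block size (all in words).
    Assumes M > B and n > B so that both logarithms are positive.
    """
    blocks = n / B    # number of blocks needed to hold the input
    fan_out = M / B   # number of blocks that fit in the cache
    return blocks * math.log(blocks) / math.log(fan_out)


# Hypothetical parameters: 2^20 elements, cache of 2^9 words, blocks of 2^3 words.
n, M, B = 2**20, 2**9, 2**3
bound = sort_io_bound(n, M, B)
scan = n / B  # cost of reading the input once, for comparison

print(round(bound), round(scan))
```

With these values the input occupies 2^17 blocks and the logarithmic factor is 17/6, so the bound is a little under three sequential scans of the input.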