Dynamic Memory Management in Vivado-HLS for Scalable Many-Accelerator Architectures

Diamantopoulos, Dionysios; Xydis, S.; Siozios, K.; Soudris, D.

doi:10.1007/978-3-319-16214-0_10

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9040)

Included in the following conference series:

International Symposium on Applied Reconfigurable Computing

Abstract

This paper discusses the incorporation of dynamic memory management during high-level synthesis (HLS) for effective resource utilization in many-accelerator architectures targeting FPGA devices. We show that in today’s FPGA devices, the main factor limiting the number of accelerators is the exhaustion of the available on-chip memory. For many-accelerator architectures, this leads to severe inefficiencies, i.e. memory-induced under-utilization of the rest of the FPGA’s resources. Recognizing that static memory allocation, the de facto mechanism supported by modern design techniques and synthesis tools, forms the main source of these “resource under-utilization” problems, we introduce the DMM-HLS framework, which extends conventional HLS with dynamic memory allocation/deallocation mechanisms to be incorporated during many-accelerator synthesis. We integrated the proposed framework with the industrial-strength Vivado-HLS tool, and we evaluate its effectiveness with a set of key accelerators from emerging application domains. DMM-HLS delivers a significant increase in the FPGA’s accelerator density (3.8× more accelerators) in exchange for affordable overheads in terms of delay and resource count.
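
The contrast between static and dynamic allocation can be illustrated with a small C sketch. It is not the DMM-HLS API, which this abstract does not describe; the names `dmm_alloc`, `dmm_free`, `dmm_ptr`, the pool size, and the block granularity are all hypothetical. The sketch assumes one statically sized pool (the kind of array an HLS tool could map to on-chip memory) that several accelerators share through a first-fit block allocator, instead of each accelerator reserving its worst-case buffer up front.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameters, chosen only for illustration. */
#define HEAP_WORDS  1024u                      /* size of the shared pool in words */
#define BLOCK_WORDS 64u                        /* allocation granularity in words  */
#define NUM_BLOCKS  (HEAP_WORDS / BLOCK_WORDS)

enum { BLK_FREE = 0, BLK_HEAD = 1, BLK_TAIL = 2 };

/* One shared pool plus per-block bookkeeping; both have fixed size, */
/* so no call to the C library's malloc is needed.                  */
static int32_t heap[HEAP_WORDS];
static uint8_t block_state[NUM_BLOCKS];        /* zero-initialized: all free */

/* First-fit allocation of n_words contiguous words.                */
/* Returns the word offset into the pool, or -1 if nothing fits.    */
int dmm_alloc(size_t n_words)
{
    if (n_words == 0 || n_words > HEAP_WORDS) return -1;
    size_t need = (n_words + BLOCK_WORDS - 1) / BLOCK_WORDS;
    size_t run = 0;                            /* length of current free run */
    for (size_t b = 0; b < NUM_BLOCKS; ++b) {
        run = (block_state[b] == BLK_FREE) ? run + 1 : 0;
        if (run == need) {
            size_t first = b + 1 - need;
            block_state[first] = BLK_HEAD;
            for (size_t k = first + 1; k <= b; ++k) block_state[k] = BLK_TAIL;
            return (int)(first * BLOCK_WORDS);
        }
    }
    return -1;
}

/* Release an allocation previously returned by dmm_alloc.          */
/* Offsets that do not name the head of an allocation are ignored.  */
void dmm_free(int offset)
{
    if (offset < 0 || (size_t)offset % BLOCK_WORDS != 0) return;
    size_t b = (size_t)offset / BLOCK_WORDS;
    if (b >= NUM_BLOCKS || block_state[b] != BLK_HEAD) return;
    block_state[b] = BLK_FREE;
    for (size_t k = b + 1; k < NUM_BLOCKS && block_state[k] == BLK_TAIL; ++k)
        block_state[k] = BLK_FREE;
}

/* Translate a word offset into a pointer, or NULL if out of range. */
int32_t *dmm_ptr(int offset)
{
    return (offset >= 0 && (size_t)offset < HEAP_WORDS) ? &heap[offset] : NULL;
}
```

Under these assumptions, an accelerator that finishes early can call `dmm_free` and return its blocks to the pool, so another accelerator can reuse memory that a static scheme would have kept reserved for the whole run.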

This work was partially supported by the “TEAChER: TEach AdvanCEd Reconfigurable architectures and tools” project funded by DAAD (2014), and by the CIDCIP and MENELAOS projects funded by the Greek Ministry of Development under the National Strategic Reference Framework NSRF 2007-2013, action “Creation of innovation clusters”, “A GREEK PRODUCT, A SINGLE MARKET: THE PLANET”.

Author information

Authors and Affiliations

School of Computer Engineering, National Technical University of Athens, Athina, Greece
Dionysios Diamantopoulos, S. Xydis, K. Siozios & D. Soudris

Corresponding author

Correspondence to Dionysios Diamantopoulos.

Editor information

Editors and Affiliations

Tohoku University, Sendai, Japan
Kentaro Sano
National Technical University of Athens, Athens, Greece
Dimitrios Soudris
Ruhr-Universität Bochum, Bochum, Germany
Michael Hübner
University of Southern California, Marina del Rey, California, USA
Pedro C. Diniz

About this paper

Cite this paper

Diamantopoulos, D., Xydis, S., Siozios, K., Soudris, D. (2015). Dynamic Memory Management in Vivado-HLS for Scalable Many-Accelerator Architectures. In: Sano, K., Soudris, D., Hübner, M., Diniz, P. (eds) Applied Reconfigurable Computing. ARC 2015. Lecture Notes in Computer Science, vol 9040. Springer, Cham. https://doi.org/10.1007/978-3-319-16214-0_10

DOI: https://doi.org/10.1007/978-3-319-16214-0_10
Published: 31 March 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16213-3
Online ISBN: 978-3-319-16214-0
eBook Packages: Computer Science, Computer Science (R0)
