
Dynamic Memory Management in Vivado-HLS for Scalable Many-Accelerator Architectures

  • Conference paper
Applied Reconfigurable Computing (ARC 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9040)


Abstract

This paper discusses the incorporation of dynamic memory management during high-level synthesis (HLS) for effective resource utilization in many-accelerator architectures targeting FPGA devices. We show that in today’s FPGA devices, the main factor limiting the number of accelerators is the starvation of the available on-chip memory. For many-accelerator architectures, this leads to severe inefficiencies, i.e. memory-induced under-utilization of the rest of the FPGA’s resources. Recognizing that static memory allocation (the de facto mechanism supported by modern design techniques and synthesis tools) is the main source of these “resource under-utilization” problems, we introduce the DMM-HLS framework, which extends conventional HLS with dynamic memory allocation and deallocation mechanisms that are incorporated during many-accelerator synthesis. We integrated the proposed framework with the industrial-strength Vivado-HLS tool and evaluated its effectiveness on a set of key accelerators from emerging application domains. DMM-HLS delivers a significant increase in FPGA accelerator density (3.8× more accelerators) in exchange for affordable overheads in delay and resource count.
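To make the contrast with static allocation concrete, the following is a minimal C sketch of a run-time allocator over one shared, statically sized buffer, the general kind of mechanism that lets several accelerators borrow on-chip memory only while they need it instead of each reserving its worst case. It is not the DMM-HLS allocator: the names (`dmm_heap_t`, `dmm_alloc`, `dmm_free`), the first-fit policy, and the sizes are assumptions chosen for illustration, and no Vivado-HLS directives are shown.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* All sizes below are illustrative assumptions, not values from the paper. */
#define HEAP_WORDS  1024                       /* shared buffer, in 32-bit words */
#define BLOCK_WORDS 64                         /* allocation granularity */
#define NUM_BLOCKS  (HEAP_WORDS / BLOCK_WORDS)
#define DMM_NULL    (-1)                       /* returned when no space is left */

/* Hypothetical heap descriptor: one fixed array plus per-block bookkeeping. */
typedef struct {
    uint32_t heap[HEAP_WORDS];  /* storage shared by all accelerators */
    uint8_t  used[NUM_BLOCKS];  /* 1 if the block is currently allocated */
    uint8_t  run[NUM_BLOCKS];   /* blocks in the allocation starting here, else 0 */
} dmm_heap_t;

/* Mark every block as free. */
void dmm_init(dmm_heap_t *h)
{
    memset(h->used, 0, sizeof h->used);
    memset(h->run, 0, sizeof h->run);
}

/* First-fit allocation of `words` words.
 * Returns the word offset into h->heap, or DMM_NULL on failure. */
int dmm_alloc(dmm_heap_t *h, int words)
{
    if (words <= 0 || words > HEAP_WORDS)
        return DMM_NULL;

    int need = (words + BLOCK_WORDS - 1) / BLOCK_WORDS;  /* round up to blocks */
    int free_run = 0;                                    /* consecutive free blocks seen */

    for (int b = 0; b < NUM_BLOCKS; b++) {
        free_run = h->used[b] ? 0 : free_run + 1;
        if (free_run == need) {
            int first = b - need + 1;
            for (int i = first; i <= b; i++)
                h->used[i] = 1;
            h->run[first] = (uint8_t)need;
            return first * BLOCK_WORDS;
        }
    }
    return DMM_NULL;
}

/* Release an allocation previously returned by dmm_alloc. */
void dmm_free(dmm_heap_t *h, int offset)
{
    assert(offset >= 0 && offset < HEAP_WORDS && offset % BLOCK_WORDS == 0);

    int first = offset / BLOCK_WORDS;
    int n = h->run[first];
    for (int i = first; i < first + n; i++)
        h->used[i] = 0;
    h->run[first] = 0;
}
```

With static allocation, two accelerators that each need up to 1024 words would require 2048 words of on-chip memory even if they never run at the same time. With a shared pool of this kind, the second request simply fails or waits until the first buffer is freed, which is the trade of memory footprint against allocation latency that the abstract describes.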

This work was partially supported by the “TEAChER: TEach AdvanCEd Reconfigurable architectures and tools” project, funded by DAAD (2014), and by the CIDCIP and MENELAOS projects, funded by the Greek Ministry of Development under the National Strategic Reference Framework (NSRF) 2007-2013, action “Creation of innovation clusters”, “A GREEK PRODUCT, A SINGLE MARKET: THE PLANET”.



Author information


Corresponding author

Correspondence to Dionysios Diamantopoulos.


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Diamantopoulos, D., Xydis, S., Siozios, K., Soudris, D. (2015). Dynamic Memory Management in Vivado-HLS for Scalable Many-Accelerator Architectures. In: Sano, K., Soudris, D., Hübner, M., Diniz, P. (eds) Applied Reconfigurable Computing. ARC 2015. Lecture Notes in Computer Science, vol 9040. Springer, Cham. https://doi.org/10.1007/978-3-319-16214-0_10


  • DOI: https://doi.org/10.1007/978-3-319-16214-0_10


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16213-3

  • Online ISBN: 978-3-319-16214-0

  • eBook Packages: Computer Science, Computer Science (R0)
