research-article

Exploration of Compute vs. Interconnect Tradeoffs in CGRAs for HPC

Published: 19 July 2023

ABSTRACT

We consider the balance between compute density and interconnect in Coarse-Grained Reconfigurable Architectures (CGRAs) intended for acceleration of HPC applications. We model a baseline CGRA architecture [2] in the open-source CGRA-ME framework [11] and describe the modelling as a case study. Then, holding the interconnect fabric constant, we create several variants of the baseline CGRA: 1) one having reduced (sparser) compute capability where not all ALUs are fully capable, 2) one having increased (denser) compute capability, where the amount of compute is roughly doubled relative to the baseline, and 3) one with increased I/O bandwidth. In an experimental study, we evaluate all architectures to assess application mappability and resource usage for a set of benchmark applications. We also evaluate silicon area consumption using a standard-cell ASIC flow. Results show the baseline CGRA to be overprovisioned in both compute and interconnect, with the proposed variants offering superior area efficiency.

References

  1. 2023. The NanGate FreePDK45 Open Cell Library.
  2. Boma Adhi, Carlos Cortes, Yiyu Tan, Takuya Kojima, Artur Podobas, and Kentaro Sano. 2022. The Cost of Flexibility: Embedded versus Discrete Routers in CGRAs for HPC. In IEEE CLUSTER.
  3. Boma Adhi, Carlos Cortes, Yiyu Tan, Takuya Kojima, Artur Podobas, and Kentaro Sano. 2022. Exploration Framework for Synthesizable CGRAs Targeting HPC: Initial Design and Evaluations. In The First International Workshop on Coarse-Grained Reconfigurable Architectures for High-Performance Computing (CGRA4HPC).
  4. Boma Adhi, Carlos Cortes, Tomohiro Ueno, Yiyu Tan, Takuya Kojima, Artur Podobas, and Kentaro Sano. 2022. Exploring Inter-tile Connectivity for HPC-oriented CGRA with Lower Resource Usage. In IEEE FPT.
  5. Giovanni Ansaloni, Paolo Bonzini, and Laura Pozzi. 2011. EGRA: A Coarse Grained Reconfigurable Architectural Template. IEEE TVLSI 19, 6 (2011), 1062–1074.
  6. Oguzhan Atak and Abdullah Atalar. 2012. BilRC: An execution triggered coarse grained reconfigurable architecture. IEEE TVLSI 21, 7 (2012), 1285–1298.
  7. Thilini Kaushalya Bandara, Dhananjaya Wijerathne, Tulika Mitra, and Li-Shiuan Peh. 2022. REVAMP: A systematic framework for heterogeneous CGRA realization. In ACM ASPLOS. 918–932.
  8. Volker Baumgarte, Gerd Ehlers, Frank May, Armin Nückel, Martin Vorbach, and Markus Weinhardt. 2003. PACT XPP – A Self-Reconfigurable Data Processing Architecture. Journal of Supercomputing 26, 2 (2003), 167–184.
  9. Frank Bouwens, Mladen Berekovic, Andreas Kanstein, and Georgi Gaydadjiev. 2007. Architectural exploration of the ADRES coarse-grained reconfigurable array. In International Workshop on Applied Reconfigurable Computing. Springer, 1–13.
  10. S. Alexander Chin and Jason H. Anderson. 2018. An Architecture-Agnostic Integer Linear Programming Approach to CGRA Mapping. In IEEE/ACM DAC.
  11. S. Alexander Chin, Noriaki Sakamoto, Allan Rui, Jim Zhao, Jin Hee Kim, Yuko Hara-Azumi, and Jason Anderson. 2017. CGRA-ME: A unified framework for CGRA modelling and exploration. In IEEE ASAP. 184–189.
  12. Florent de Dinechin and Bogdan Pasca. 2011. Designing Custom Arithmetic Data Paths with FloPoCo. IEEE Design & Test of Computers 28, 4 (July 2011), 18–27.
  13. Jens Domke, Kazuaki Matsumura, Mohamed Wahib, Haoyu Zhang, Keita Yashima, Toshiki Tsuchikawa, Yohei Tsuji, Artur Podobas, and Satoshi Matsuoka. 2019. Double-precision FPUs in high-performance computing: an embarrassment of riches? In IEEE IPDPS. 78–88.
  14. Carl Ebeling, Darren C Cronquist, and Paul Franklin. 1996. RaPiD – Reconfigurable pipelined datapath. In FPL. 126–135.
  15. Graham Gobieski, Ahmet Oguz Atli, Kenneth Mai, Brandon Lucia, and Nathan Beckmann. 2021. Snafu: an ultra-low-power, energy-minimal CGRA-generation framework and architecture. In ACM/IEEE ISCA. 1027–1040.
  16. Seth Copen Goldstein, Herman Schmit, Mihai Budiu, Srihari Cadambi, Matthew Moe, and R Reed Taylor. 2000. PipeRench: A reconfigurable architecture and compiler. Computer 33, 4 (2000), 70–77.
  17. Ujval J Kapasi, William J Dally, Scott Rixner, John D Owens, and Brucek Khailany. 2002. The Imagine stream processor. In IEEE ICCD. 282–288.
  18. Manupa Karunaratne, Aditi Kulkarni Mohite, Tulika Mitra, and Li-Shiuan Peh. 2017. HyCUBE: A CGRA with reconfigurable single-cycle multi-hop interconnect. In IEEE/ACM DAC.
  19. Scott Kirkpatrick, C Daniel Gelatt Jr, and Mario P Vecchi. 1983. Optimization by simulated annealing. Science 220, 4598 (1983), 671–680.
  20. Guangming Lu, Hartej Singh, Ming-Hau Lee, Nader Bagherzadeh, Fadi Kurdahi, 1999. The MorphoSys parallel reconfigurable system. In European Conference on Parallel Processing. Springer, 727–734.
  21. L. McMurchie and C. Ebeling. 1995. PathFinder: A Negotiation-Based Performance-Driven Router for FPGAs. In ACM FPGA. 111–7. https://doi.org/10.1109/FPGA.1995.242049
  22. Bingfeng Mei, Serge Vernalde, Diederik Verkest, Hugo De Man, and Rudy Lauwereins. 2003. ADRES: An architecture with tightly coupled VLIW processor and coarse-grained reconfigurable matrix. In FPL. 61–70.
  23. Tony Nowatzki, Vinay Gangadhar, Newsha Ardalani, and Karthikeyan Sankaralingam. 2017. Stream-dataflow acceleration. In IEEE/ACM ISCA. 416–429.
  24. Raghu Prabhakar, Yaqi Zhang, David Koeplinger, Matt Feldman, Tian Zhao, Stefan Hadjis, Ardavan Pedram, Christos Kozyrakis, and Kunle Olukotun. 2017. Plasticine: A reconfigurable architecture for parallel patterns. In ACM/IEEE ISCA. 389–402.
  25. Rohit Prasad, Satyajit Das, Kevin JM Martin, Giuseppe Tagliavini, Philippe Coussy, Luca Benini, and Davide Rossi. 2020. TRANSPIRE: An energy-efficient TRANSprecision floating-point Programmable archItectuRE. In IEEE/ACM DATE. IEEE, 1067–1072.
  26. Omar Ragheb, Rami Beidas, and Jason Anderson. 2023. Statically Scheduled vs. Elastic CGRA Architectures: Impact on Mapping Feasibility. In Second International Workshop on CGRAs for HPC (CGRA4HPC).
  27. Omar Ragheb, Tianyi Yu, and Jason Anderson. 2022. Modelling and exploration of elastic CGRAs. In FPL.
  28. Matthew J. P. Walker and Jason H. Anderson. 2019. Generic Connectivity-Based CGRA Mapping via Integer Linear Programming. In IEEE FCCM. 65–73. https://doi.org/10.1109/FCCM.2019.00019

Published in

HEART '23: Proceedings of the 13th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies
June 2023
127 pages
ISBN: 9798400700439
DOI: 10.1145/3597031

Copyright © 2023 ACM


Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 22 of 50 submissions, 44%