DOI: 10.1145/3453688.3461494
Research article

Re2PIM: A Reconfigurable ReRAM-Based PIM Design for Variable-Sized Vector-Matrix Multiplication

Published: 22 June 2021

ABSTRACT

ReRAM-based deep neural network (DNN) accelerators show enormous potential because of ReRAM's high computational density and power efficiency. A typical feature of DNNs is that the weight matrix size varies across diverse DNNs and DNN layers. However, current ReRAM-based DNN accelerators adopt a fixed-size compute unit (CU) design, which forces a trade-off between throughput and energy efficiency: when computing a large vector-matrix multiplication with small CUs, the overhead of the peripheral circuits is relatively high; when computing a small vector-matrix multiplication with large CUs, the low utilization of the ReRAM crossbars hurts throughput. In this work, we propose Re2PIM, a reconfigurable ReRAM-based DNN accelerator. Each tile of Re2PIM is composed of reconfigurable units (RUs), each of which can be configured as a vector-matrix multiplier (VMM), a digital-to-analog converter (DAC), or an analog shift-and-add unit (AS+A). We can reconfigure the RUs to obtain CUs of various sizes matched to a DNN's weight matrices, ensuring high energy efficiency without sacrificing throughput across diverse DNN workloads. Evaluations on different DNN benchmarks show that Re2PIM achieves 27×/34×/1.5× improvement in energy efficiency and 5.7×/17×/8.2× improvement in computational throughput compared to the state-of-the-art accelerators PRIME, ISAAC, and TIMELY, respectively.
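The fixed-size CU dilemma described in the abstract can be illustrated with a toy utilization model (a minimal sketch with illustrative names, not code from the paper): a weight matrix is tiled onto fixed-size crossbars, and any cells in partially filled tiles are wasted.

```python
import math

def crossbar_utilization(rows, cols, cu_rows, cu_cols):
    """Fraction of allocated ReRAM cells that actually hold weights when
    a (rows x cols) weight matrix is tiled onto fixed-size
    (cu_rows x cu_cols) compute units. Cells in partially filled CUs
    are allocated but unused."""
    n_row_tiles = math.ceil(rows / cu_rows)
    n_col_tiles = math.ceil(cols / cu_cols)
    allocated = n_row_tiles * cu_rows * n_col_tiles * cu_cols
    return (rows * cols) / allocated

# A small 100x64 layer on one large 256x256 CU wastes ~90% of the cells:
small_on_large = crossbar_utilization(100, 64, 256, 256)   # ~0.098

# The same layer on 64x64 CUs is well utilized (~0.78), but each of the
# extra tiles needs its own peripheral circuits (DACs, shift-and-adds),
# raising the relative peripheral overhead for large layers:
small_on_small = crossbar_utilization(100, 64, 64, 64)     # 0.78125
```

A reconfigurable design sidesteps this trade-off by choosing the CU size per weight matrix rather than fixing it at design time.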


Supplemental Material

GLSVLSI21-glsv056.mp4 (mp4, 118.9 MB)

References

  1. Rajeev Balasubramonian et al. 2017. CACTI 7: New Tools for Interconnect Exploration in Innovative Off-Chip Memories. ACM Trans. Archit. Code Optim. 14, 2, Article 14 (June 2017), 25 pages. https://doi.org/10.1145/3085572
  2. W. Cao et al. 2019. Neural Network-Inspired Analog-to-Digital Conversion to Achieve Super-Resolution with Low-Precision RRAM Devices. In ICCAD. 1--7. https://doi.org/10.1109/ICCAD45719.2019.8942099
  3. L. Chen et al. 2017. Accelerator-friendly neural-network training: Learning variations and defects in RRAM crossbar. In DATE. 19--24. https://doi.org/10.23919/DATE.2017.7926952
  4. P. Chen et al. 2015. Compact Modeling of RRAM Devices and Its Applications in 1T1R and 1S1R Array Design. IEEE Transactions on Electron Devices 62, 12 (2015), 4022--4028. https://doi.org/10.1109/TED.2015.2492421
  5. P. Chi et al. 2016. PRIME: A Novel Processing-in-Memory Architecture for Neural Network Computation in ReRAM-Based Main Memory. In ISCA. 27--39. https://doi.org/10.1109/ISCA.2016.13
  6. Teyuh Chou et al. 2019. CASCADE: Connecting RRAMs to Extend Analog Dataflow in an End-to-End In-Memory Processing Paradigm. In MICRO '52 (Columbus, OH, USA). 114--125. https://doi.org/10.1145/3352460.3358328
  7. Jacob Devlin et al. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805 [cs.CL]
  8. Gerald Gamrath et al. 2020. The SCIP Optimization Suite 7.0. Technical Report. Optimization Online. http://www.optimization-online.org/DB_HTML/2020/03/7705.html
  9. Kaiming He et al. 2015. Deep Residual Learning for Image Recognition. arXiv:1512.03385 [cs.CV]
  10. K. He et al. 2015. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. In ICCV. 1026--1034. https://doi.org/10.1109/ICCV.2015.123
  11. Zhezhi He et al. 2019. Noise Injection Adaption: End-to-End ReRAM Crossbar Non-Ideal Effect Adaption for Neural Network Mapping. In DAC '19. ACM, New York, NY, USA, Article 57, 6 pages. https://doi.org/10.1145/3316781.3317870
  12. Andrew G. Howard et al. 2017. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv:1704.04861 [cs.CV]
  13. A. Karpathy et al. 2015. Deep visual-semantic alignments for generating image descriptions. In CVPR. 3128--3137. https://doi.org/10.1109/CVPR.2015.7298932
  14. Alex Krizhevsky et al. 2017. ImageNet Classification with Deep Convolutional Neural Networks. Commun. ACM 60, 6 (May 2017), 84--90. https://doi.org/10.1145/3065386
  15. W. Li et al. 2020. TIMELY: Pushing Data Movements and Interfaces in PIM Accelerators Towards Local and in Time Domain. In ISCA. 832--845. https://doi.org/10.1109/ISCA45697.2020.00073
  16. M. O'Halloran et al. 2004. A 10-nW 12-bit accurate analog storage cell with 10-aA leakage. IEEE Journal of Solid-State Circuits 39, 11 (2004), 1985--1996. https://doi.org/10.1109/JSSC.2004.835817
  17. Adam Paszke et al. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32. 8024--8035.
  18. Fabrice Salvaire et al. [n.d.]. PySpice. https://pyspice.fabrice-salvaire.fr
  19. A. Shafiee et al. 2016. ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars. In ISCA. 14--26. https://doi.org/10.1109/ISCA.2016.12
  20. N. Silberman et al. 2016. TensorFlow-Slim image classification model library. https://github.com/tensorflow/models/tree/master/research/slim
  21. Karen Simonyan et al. 2015. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv:1409.1556 [cs.CV]
  22. M. Zhao et al. 2018. Characterizing Endurance Degradation of Incremental Switching in Analog RRAM for Neuromorphic Systems. In IEDM. 20.2.1--20.2.4. https://doi.org/10.1109/IEDM.2018.8614664

        • Published in

          cover image ACM Conferences
          GLSVLSI '21: Proceedings of the 2021 on Great Lakes Symposium on VLSI
          June 2021
          504 pages
          ISBN:9781450383936
          DOI:10.1145/3453688

          Copyright © 2021 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 312 of 1,156 submissions, 27%
