DOI: 10.1145/3240765.3240767
research-article

3DICT: A Reliable and QoS Capable Mobile Process-In-Memory Architecture for Lookup-based CNNs in 3D XPoint ReRAMs

Published: 05 November 2018

ABSTRACT

Deploying compute-intensive convolutional neural networks (CNNs) with large parameter sets on mobile devices is extremely challenging because of the devices' limited computing resources and low power budgets. Prior work builds fast, energy-efficient CNN accelerators by greatly sacrificing test accuracy, yet mobile devices must guarantee high CNN test accuracy for critical applications, e.g., unlocking phones by face recognition. In this paper, we propose 3DICT, a 3D XPoint ReRAM-based process-in-memory architecture that provides different test accuracies to applications with different priorities through lookup-based CNN tests that dynamically exploit the trade-off between test accuracy and latency. Compared to state-of-the-art accelerators, 3DICT improves CNN test performance per Watt by 13% to 61× on average and guarantees 9-year endurance under various CNN test accuracy requirements.
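
The abstract does not describe how the lookup-based tests are implemented, so the following is only a minimal sketch of the general idea behind a lookup-based dot product with an accuracy/latency knob. It assumes a filter is stored as a few (index, coefficient) pairs over a shared dictionary of vectors. The names `lookup_dot`, `dictionary`, `indices`, `coeffs`, and `budget` are hypothetical and are not taken from the paper.

```python
import numpy as np

def lookup_dot(x, dictionary, indices, coeffs, budget=None):
    """Approximate w.x where w = sum_t coeffs[t] * dictionary[indices[t]].

    `budget` (hypothetical knob) caps how many terms are combined:
    fewer terms means fewer lookups but a coarser approximation.
    """
    # Inner product of the input with every dictionary vector, computed once.
    table = dictionary @ x
    # Rank terms by coefficient magnitude and keep the `budget` largest.
    order = np.argsort(-np.abs(coeffs))
    if budget is not None:
        order = order[:budget]
    # Each kept term costs one table lookup and one multiply-add.
    return float(np.sum(coeffs[order] * table[indices[order]]))

rng = np.random.default_rng(0)
dictionary = rng.standard_normal((16, 8))   # 16 shared vectors of length 8
indices = np.array([3, 7, 11, 2])           # dictionary rows used by one filter
coeffs = np.array([0.9, -0.5, 0.2, 0.05])   # their combination weights
x = rng.standard_normal(8)                  # one input patch

# Reference result using the explicitly reconstructed filter.
w = (coeffs[:, None] * dictionary[indices]).sum(axis=0)
exact = float(w @ x)

full = lookup_dot(x, dictionary, indices, coeffs)             # all 4 terms
cheap = lookup_dot(x, dictionary, indices, coeffs, budget=2)  # 2 largest terms
print(exact, full, cheap)
```

With all terms the lookup result matches the explicit dot product up to floating-point rounding. Lowering `budget` drops the smallest-magnitude terms, which is one simple way to trade accuracy for fewer lookups. Whether 3DICT uses this particular mechanism is an assumption of the sketch.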


Published in

2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)
Nov 2018
939 pages

Copyright © 2018

Publisher

IEEE Press
