research-article

3DICT: A Reliable and QoS Capable Mobile Process-In-Memory Architecture for Lookup-based CNNs in 3D XPoint ReRAMs

Authors:
Qian Lou (Indiana University Bloomington),
Wujie Wen (Florida International University),
Lei Jiang (Indiana University Bloomington)

2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), Nov 2018, Pages 1–8, https://doi.org/10.1145/3240765.3240767

Published: 05 November 2018


ABSTRACT

Deploying compute-intensive convolutional neural networks (CNNs) with large parameter sets on mobile devices is extremely challenging because of their limited computing resources and low power budgets. Prior works build fast and energy-efficient CNN accelerators by greatly sacrificing test accuracy, yet mobile devices have to guarantee high CNN test accuracy for critical applications, e.g., unlocking phones by face recognition. In this paper, we propose a 3D XPoint ReRAM-based process-in-memory architecture, 3DICT, that provides different test accuracies to applications with different priorities through lookup-based CNN tests that dynamically exploit the trade-off between test accuracy and latency. Compared to the state-of-the-art accelerators, on average, 3DICT improves the CNN test performance per Watt by 13% to 61× and guarantees 9-year endurance under various CNN test accuracy requirements.
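
The lookup idea behind the accuracy/latency trade-off can be illustrated with a minimal sketch. It assumes the general lookup-based formulation in which each filter is a sparse linear combination of shared dictionary vectors, so the input is multiplied with the dictionary once and each filter output is assembled from a few looked-up responses. The function names, the toy values, and the `budget` truncation policy are hypothetical choices for illustration; they are not taken from the 3DICT design.

```python
# Illustrative sketch only: names, values, and the truncation policy are
# assumptions, not the 3DICT implementation.

def dictionary_responses(x, dictionary):
    """Dot product of the input with every dictionary vector.

    Computed once and shared by all filters that use the dictionary.
    """
    return [sum(xi * di for xi, di in zip(x, d)) for d in dictionary]


def lookup_output(responses, entries, budget=None):
    """Assemble one filter output from looked-up dictionary responses.

    entries: list of (dictionary_index, coefficient) pairs for this filter.
    budget:  number of pairs to use, largest |coefficient| first
             (None uses all pairs, i.e. the exact result).
    """
    ranked = sorted(entries, key=lambda e: abs(e[1]), reverse=True)
    used = ranked if budget is None else ranked[:budget]
    return sum(coeff * responses[idx] for idx, coeff in used)


# Toy data: four dictionary vectors, one input patch, one filter.
dictionary = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
]
x = [2.0, -1.0, 0.5]
filter_entries = [(0, 0.5), (3, 2.0), (1, -0.25)]

responses = dictionary_responses(x, dictionary)
full = lookup_output(responses, filter_entries)             # all three lookups
cheap = lookup_output(responses, filter_entries, budget=1)  # one lookup only
print(full, cheap)
```

With all three lookups the result equals the dense dot product of `x` with the reconstructed filter `[2.5, 1.75, 2.0]`; with a smaller `budget` fewer lookups and additions are performed and the result is only an approximation. This is one simple way such a trade-off could be exposed per request.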


Index Terms

1. Computer systems organization
2. Hardware

Index terms have been assigned to the content through auto-classification.

Published in

2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), Nov 2018, 939 pages

Copyright © 2018
Publisher: IEEE Press
Article Metrics
- Total Citations: 7
- Total Downloads: 355
- Downloads (Last 12 months): 0
- Downloads (Last 6 weeks): 0