ABSTRACT
On-chip learning with the compute-in-memory (CIM) paradigm has become popular in machine learning hardware design in recent years. However, it is difficult to achieve high on-chip learning accuracy due to the strong nonlinearity in the weight-update curve of emerging nonvolatile memory (eNVM) based analog synaptic devices. Although digital synaptic devices offer good learning accuracy, their row-by-row partial-sum accumulation leads to high latency. In this paper, methods to address these issues are presented through device-to-algorithm co-optimization. For analog synapses, novel hybrid-precision synapses with good linearity and more advanced training algorithms are introduced to improve on-chip learning accuracy. The latency of digital synapses is reduced by a parallel partial-sum read-out scheme. All of these features are included in the recently released MLP + NeuroSimV3.0, an in-house device-to-system evaluation framework for neuro-inspired accelerators based on the CIM paradigm.
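The weight-update nonlinearity described above can be illustrated with a common exponential-saturation device model, in which identical programming pulses produce large conductance changes early in the potentiation curve and progressively smaller ones near saturation. The sketch below is only illustrative, assuming a NeuroSim-style single-parameter nonlinearity model; the parameter names (`nonlinearity_a`, `g_min`, `g_max`, `max_pulses`) are hypothetical and not taken from the paper.

```python
import math

def potentiation_curve(num_pulses, nonlinearity_a, g_min=0.0, g_max=1.0, max_pulses=64):
    """Conductance after `num_pulses` identical potentiation pulses.

    Exponential-saturation sketch of an analog eNVM synapse:
    a smaller `nonlinearity_a` makes the curve saturate earlier,
    i.e. a more nonlinear (worse) weight-update behavior.
    """
    # Scale factor so the curve spans [g_min, g_max] over max_pulses pulses.
    b = (g_max - g_min) / (1.0 - math.exp(-max_pulses / nonlinearity_a))
    return b * (1.0 - math.exp(-num_pulses / nonlinearity_a)) + g_min

# For a highly nonlinear device (small nonlinearity_a), the first pulse
# moves the conductance far more than the last pulse, which is what
# degrades on-chip training accuracy relative to an ideal linear update.
first_step = potentiation_curve(1, 4.0) - potentiation_curve(0, 4.0)
last_step = potentiation_curve(64, 4.0) - potentiation_curve(63, 4.0)
```

With a very large `nonlinearity_a` the same model degenerates to a nearly linear update (equal conductance steps per pulse), which is the behavior the hybrid-precision synapses aim to approximate.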
Index Terms
- MLP+NeuroSimV3.0: Improving On-chip Learning Performance with Device to Algorithm Optimizations