Abstract
Over the past few years, Deep Neural Networks (DNNs) have achieved state-of-the-art performance in numerous challenging domains. To reach this performance, DNNs consist of large sets of parameters and complex architectures, which are trained offline on huge datasets. The complexity and size of DNN architectures make such approaches difficult to implement in budget-restricted applications such as embedded systems. Furthermore, DNNs cannot learn new data incrementally without forgetting previously acquired knowledge, which makes embedded applications even more challenging, since the whole dataset would otherwise have to be stored. To tackle this problem, we introduce an incremental learning method that combines pre-trained DNNs, binary associative memories, and product quantization (PQ) as a bridge between them. The resulting method has lower computational and memory requirements, and reaches good performance on challenging vision datasets. Moreover, we present a hardware implementation, validated on an FPGA target, that uses few hardware resources while providing substantial processing acceleration compared to a CPU counterpart.
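To illustrate the product-quantization step that bridges DNN features and the associative memories, the following is a minimal sketch (not the paper's implementation; function and variable names such as `pq_encode` and `codebooks` are hypothetical). A feature vector is split into subvectors, and each subvector is replaced by the index of its nearest centroid in a per-subspace codebook, yielding a compact code:

```python
import numpy as np

def pq_encode(x, codebooks):
    """Product-quantize vector x: split it into as many subvectors as
    there are codebooks, and map each subvector to the index of its
    nearest centroid in the matching codebook."""
    m = len(codebooks)              # number of subspaces
    subvectors = np.split(x, m)     # requires len(x) divisible by m
    codes = []
    for sub, cb in zip(subvectors, codebooks):
        # cb has shape (k, d/m): k centroids per subspace
        dists = np.linalg.norm(cb - sub, axis=1)
        codes.append(int(np.argmin(dists)))
    return codes

# Toy example: 4-dim features, 2 subspaces, 3 centroids per subspace.
rng = np.random.default_rng(0)
codebooks = [rng.normal(size=(3, 2)) for _ in range(2)]
feature = rng.normal(size=4)
codes = pq_encode(feature, codebooks)
print(codes)
```

The code for a feature vector is then a short list of small integers, which is what makes storing and matching exemplars cheap enough for embedded targets.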
Cite this article
Boukli Hacene, G., Gripon, V., Farrugia, N. et al. Budget Restricted Incremental Learning with Pre-Trained Convolutional Neural Networks and Binary Associative Memories. J Sign Process Syst 91, 1063–1073 (2019). https://doi.org/10.1007/s11265-019-01450-z