ABSTRACT
Video and image content plays a growing role in applications ranging from video games to autonomous vehicles. In this paper, we present accelerators for gist-based scene recognition, saliency-based attention, and HMAX-based object recognition that serve multiple uses and are grounded in the current understanding of the vision systems in the visual cortex of the mammalian brain. By integrating them into a two-level hierarchical system, we improve recognition accuracy and reduce computation time. Results from our accelerator prototype on a multi-FPGA system show real-time performance and high recognition accuracy, with large speedups over existing CPU, GPU, and FPGA implementations.
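The two-level hierarchy described above can be sketched in software as a pipeline in which a coarse scene-level stage gates an attention-guided object-recognition stage. The sketch below is only an illustration of that data flow: every function here (`gist_descriptor`, `salient_region`, `classify_scene`) is a hypothetical, drastically simplified stand-in for the actual gist, saliency, and HMAX accelerators, which are hardware modules, not Python code.

```python
import numpy as np

def gist_descriptor(image):
    # Hypothetical stand-in for a gist extractor: a 4x4 grid of
    # mean intensities summarizing the coarse scene layout.
    h, w = image.shape
    grid = image[: h // 4 * 4, : w // 4 * 4].reshape(4, h // 4, 4, w // 4)
    return grid.mean(axis=(1, 3)).ravel()

def salient_region(image):
    # Hypothetical saliency stage: the brightest pixel stands in
    # for the most salient ("attended") location.
    return np.unravel_index(np.argmax(image), image.shape)

def classify_scene(gist):
    # Placeholder first-level classifier that gates the second level.
    return "outdoor" if gist.mean() > 0.5 else "indoor"

def recognize(image):
    # Two-level hierarchy: scene context first, then recognition
    # restricted to a patch around the attended location.
    scene = classify_scene(gist_descriptor(image))
    y, x = salient_region(image)
    patch = image[max(0, y - 8): y + 8, max(0, x - 8): x + 8]
    # An HMAX-style recognizer would run on `patch` here; we return
    # the intermediate results to illustrate the data flow.
    return scene, (int(y), int(x)), patch.shape

img = np.zeros((64, 64))
img[10, 20] = 1.0
print(recognize(img))  # ('indoor', (10, 20), (16, 16))
```

Restricting the expensive second stage to the attended region is what yields the reduced computation time claimed in the abstract; the scene label additionally constrains which object categories the second stage must consider.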
Index Terms
- Accelerators for biologically-inspired attention and recognition