Improving object recognition of CNNs with multiple queries and HMMs
31 January 2020
László Czúni, Amr M. Nagy
Proceedings Volume 11433, Twelfth International Conference on Machine Vision (ICMV 2019); 1143310 (2020) https://doi.org/10.1117/12.2559393
Event: Twelfth International Conference on Machine Vision, 2019, Amsterdam, Netherlands
Abstract
In this paper we combine neural networks with Hidden Markov Models (HMMs) for multiview object recognition. While convolutional neural networks are very effective at object recognition, there is still a need for improvement in many practical cases. For example, if training is unsatisfactory or object localization is not solved by the neural network, then fusing information from several images and from inertial sensors can still substantially improve the recognition rate. In our use case we recognize objects from several directions with the VGG16 network. We assume that objects cannot be localized in the images due to the lack of bounding box annotations, so we have to recognize the objects even if they occupy only about 25% of the field of view. To overcome this problem we propose a Hidden Markov Model approach in which consecutive queries, that is, shots taken from different viewing directions, are first evaluated with VGG16 inference and then with the Viterbi algorithm. The role of the latter is to estimate the most probable sequence of poses for each candidate (from 8 predefined horizontal views in our experiments), so that we can select the most probable object. The approach, evaluated with different numbers of queries over a set of 40 objects from the COIL-100 dataset, can significantly increase the hit rate compared to one-shot recognition or to combining individual shots without the HMM.
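The decoding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the emission or transition models, so the code assumes that the network yields one score per (object, pose) pair for each shot, that the relative camera motion between shots is known as an integer number of view steps (for example from inertial sensors), and that pose transitions follow a hypothetical exponential falloff around the expected pose. The function names `transition_matrix`, `viterbi_log_score`, and `recognize`, and the `sharpness` parameter, are illustrative only.

```python
import numpy as np

N_POSES = 8  # eight predefined horizontal views, as stated in the abstract


def transition_matrix(step, sharpness=2.0):
    """Hypothetical pose-transition model (an assumption, not from the paper).

    After the camera moves by `step` views, pose i most likely becomes
    pose (i + step) mod N_POSES; other poses decay with circular distance.
    """
    poses = np.arange(N_POSES)
    expected = (poses[:, None] + step) % N_POSES
    diff = np.abs(poses[None, :] - expected)
    dist = np.minimum(diff, N_POSES - diff)  # circular distance
    weights = np.exp(-sharpness * dist)
    return weights / weights.sum(axis=1, keepdims=True)  # rows sum to 1


def viterbi_log_score(emissions, steps):
    """Viterbi decoding over poses for one candidate object.

    emissions: array (T, N_POSES) of per-shot pose scores for this object.
    steps: T-1 integers, the assumed view shift between consecutive shots.
    Returns (log-probability of the best pose path, the path itself).
    """
    log_e = np.log(emissions + 1e-12)  # small constant avoids log(0)
    n_shots = log_e.shape[0]
    delta = np.log(1.0 / N_POSES) + log_e[0]  # uniform prior over poses
    back = np.zeros((n_shots, N_POSES), dtype=int)
    for t in range(1, n_shots):
        log_a = np.log(transition_matrix(steps[t - 1]))
        cand = delta[:, None] + log_a  # cand[i, j]: move from pose i to j
        back[t] = cand.argmax(axis=0)
        delta = cand.max(axis=0) + log_e[t]
    # Backtrack from the best final pose to recover the pose sequence.
    path = [int(delta.argmax())]
    for t in range(n_shots - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return float(delta.max()), path[::-1]


def recognize(scores, steps):
    """Select the object whose best pose path is most probable.

    scores: array (T, n_objects, N_POSES) of per-shot network scores.
    Returns (index of the selected object, its decoded pose path).
    """
    results = [viterbi_log_score(scores[:, k, :], steps)
               for k in range(scores.shape[1])]
    best = max(range(len(results)), key=lambda k: results[k][0])
    return best, results[best][1]
```

Under these assumptions, an object whose per-shot scores follow the camera motion consistently can outrank an object that scores slightly higher in each individual shot but at mutually inconsistent poses, which is the kind of gain over combining shots without the HMM that the abstract reports.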
© (2020) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
KEYWORDS
Object recognition
Cameras
3D modeling
Neural networks
Eye models
Information fusion
Data modeling