Object Detection and Localization Using Local and Global Features

Murphy, Kevin; Torralba, Antonio; Eaton, Daniel; Freeman, William

doi:10.1007/11957959_20

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4170)

Abstract

Traditional approaches to object detection look only at local pieces of the image, whether within a sliding window or in the regions around detected interest points. However, such local pieces can be ambiguous, especially when the object of interest is small or imaging conditions are otherwise unfavorable. This ambiguity can be reduced by using global features of the image, which we call the “gist” of the scene, as an additional source of evidence. We show that combining local and global features significantly improves detection rates. In addition, since the gist is much cheaper to compute than most local detectors, we can potentially gain a large increase in speed as well.
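
The idea of treating global scene context as a second source of evidence can be illustrated with a minimal sketch. This is not the chapter's actual model, which the abstract does not spell out. It assumes, purely for illustration, that a local detector and a gist-based predictor each output a probability that the object is present in a window, that the two are conditionally independent given the true label, and that the prior is uniform, so their log-odds can simply be added. The function name `combine` and all numeric values are hypothetical.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability, clamped away from 0 and 1 for stability."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    """Inverse of logit: map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def combine(local_p: float, global_p: float) -> float:
    """Fuse two probability estimates by summing their log-odds.

    Assumption (not from the chapter): the local and global estimates are
    conditionally independent given the label, and the prior is uniform.
    """
    return sigmoid(logit(local_p) + logit(global_p))


if __name__ == "__main__":
    # Hypothetical windows: the local detector is equally unsure about both,
    # but the (made-up) global context favors one location over the other.
    windows = {
        "street-level window": (0.55, 0.80),
        "sky window": (0.55, 0.05),
    }
    for name, (local_p, global_p) in windows.items():
        print(f"{name}: combined = {combine(local_p, global_p):.3f}")
```

Under these assumptions, an ambiguous local score is pushed up where the global context makes the object plausible and pushed down where it does not, which is the qualitative effect the abstract describes.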

Author information

Authors and Affiliations

Department of Computer Science, University of British Columbia
Kevin Murphy & Daniel Eaton
Computer Science and AI Lab, MIT
Antonio Torralba & William Freeman

Editor information

Editors and Affiliations

Département d’Informatique, Ecole Normale Supérieure, P.O. Box, Paris, France
Jean Ponce
Carnegie Mellon University, Pittsburgh, USA
Martial Hebert
GRAVIR-INRIA, 655 avenue de l’Europe, P.O. Box, 38330, Montbonnot, France
Cordelia Schmid
Department of Engineering Science, University of Oxford, Parks Road, OX1 3PJ, Oxford, UK
Andrew Zisserman

About this chapter

Cite this chapter

Murphy, K., Torralba, A., Eaton, D., Freeman, W. (2006). Object Detection and Localization Using Local and Global Features. In: Ponce, J., Hebert, M., Schmid, C., Zisserman, A. (eds) Toward Category-Level Object Recognition. Lecture Notes in Computer Science, vol 4170. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11957959_20

DOI: https://doi.org/10.1007/11957959_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68794-8
Online ISBN: 978-3-540-68795-5
eBook Packages: Computer Science, Computer Science (R0)
