Object Detection and Localization Using Local and Global Features

Chapter in: Toward Category-Level Object Recognition

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4170)

Abstract

Traditional approaches to object detection look only at local pieces of the image, whether within a sliding window or in the regions around detected interest points. However, such local pieces can be ambiguous, especially when the object of interest is small or imaging conditions are otherwise unfavorable. This ambiguity can be reduced by using global features of the image, which we call the “gist” of the scene, as an additional source of evidence. We show that combining local and global features significantly improves detection rates. In addition, since the gist is much cheaper to compute than most local detectors, we can potentially gain a large increase in speed as well.
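
To make the idea of combining local and global evidence concrete, here is a minimal sketch. It is not the method of the chapter, whose details are not given in the abstract. It assumes two things that are purely illustrative: a crude whole-image descriptor (mean gradient energy on a coarse grid) standing in for the gist, and a fusion rule that adds the log-odds of a local detector and a global predictor, which corresponds to treating the two sources as conditionally independent with a uniform prior. The names `global_descriptor` and `combine_scores` are hypothetical.

```python
import numpy as np


def global_descriptor(image, grid=4):
    """Illustrative stand-in for a gist: mean gradient energy per grid cell.

    `image` is a 2-D grayscale array. The result has grid * grid entries.
    """
    gy, gx = np.gradient(image.astype(float))
    energy = np.hypot(gx, gy)
    h, w = energy.shape
    # Crop so that the image divides evenly into grid x grid cells.
    energy = energy[: h - h % grid, : w - w % grid]
    cells = energy.reshape(grid, energy.shape[0] // grid,
                           grid, energy.shape[1] // grid)
    return cells.mean(axis=(1, 3)).ravel()


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def combine_scores(local_logit, global_logit):
    """Fuse local and global evidence by summing log-odds.

    Assumption (not from the source): the two sources are conditionally
    independent given object presence, and the prior is uniform.
    """
    return sigmoid(local_logit + global_logit)
```

Under these assumptions, an ambiguous local window with log-odds 0 (probability 0.5) takes on whatever the global evidence suggests: with global log-odds of `np.log(9.0)` the fused probability is 0.9, and with `-np.log(9.0)` it is 0.1. In practice the global log-odds would come from a classifier trained on descriptors such as the one above, which this sketch leaves out.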

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Murphy, K., Torralba, A., Eaton, D., Freeman, W. (2006). Object Detection and Localization Using Local and Global Features. In: Ponce, J., Hebert, M., Schmid, C., Zisserman, A. (eds) Toward Category-Level Object Recognition. Lecture Notes in Computer Science, vol 4170. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11957959_20

  • DOI: https://doi.org/10.1007/11957959_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68794-8

  • Online ISBN: 978-3-540-68795-5

  • eBook Packages: Computer Science, Computer Science (R0)
