Object Classification Using a Fragment-Based Representation

Ullman, Shimon; Sali, Erez

doi:10.1007/3-540-45482-9_8

Conference paper
First Online: 01 January 2002

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1811)

Abstract

The tasks of visual object recognition and classification are natural and effortless for biological visual systems, but exceedingly difficult to replicate in computer vision systems. This difficulty arises from the large variability in images of different objects within a class, and from variability in viewing conditions. In this paper we describe a fragment-based method for object classification. In this approach, objects within a class are represented in terms of common image fragments that serve as building blocks for representing a large variety of different objects belonging to a common class, such as faces or cars. Optimal fragments are selected from a training set of images based on a criterion of maximizing the mutual information between the fragments and the class they represent. For the purpose of classification, the fragments are also organized into types, where each type is a collection of alternative fragments, such as different hairline or eye regions for face classification. During classification, the algorithm detects fragments of the different types and then combines the evidence from the detected fragments to reach a final decision. The algorithm verifies the proper arrangement of the fragments and the consistency of the viewing conditions primarily by the conjunction of overlapping fragments. The method differs from previous part-based methods in using class-specific overlapping object fragments of varying complexity, and in verifying the consistent arrangement of the fragments primarily by the conjunction of overlapping detected fragments. Experimental results on the detection of face and car views show that the fragment-based approach can generalize well to completely novel image views within a class while maintaining low misclassification error rates. We briefly discuss relationships between the proposed method and properties of parts of the primate visual system involved in object perception.
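The selection criterion named in the abstract, maximizing the mutual information between a fragment and the class, can be illustrated with a minimal sketch. It assumes each candidate fragment has already been reduced to a binary "detected / not detected" flag per training image and that the class label is binary. The fragment names, the toy data, and the helper functions `mutual_information` and `rank_fragments` are hypothetical and are not taken from the paper, which may estimate the quantity differently.

```python
import math

def mutual_information(detected, labels):
    """Estimate I(F; C) in bits from paired binary samples.

    detected[i] is 1 if the fragment was found in training image i,
    labels[i] is 1 if image i belongs to the class.
    """
    n = len(labels)
    # Joint counts over (fragment flag, class label).
    joint = {(f, c): 0 for f in (0, 1) for c in (0, 1)}
    for f, c in zip(detected, labels):
        joint[(int(f), int(c))] += 1

    mi = 0.0
    for (f, c), count in joint.items():
        if count == 0:
            continue  # a zero-probability cell contributes nothing
        p_fc = count / n
        p_f = (joint[(f, 0)] + joint[(f, 1)]) / n
        p_c = (joint[(0, c)] + joint[(1, c)]) / n
        mi += p_fc * math.log2(p_fc / (p_f * p_c))
    return mi

def rank_fragments(detections, labels):
    """Return fragment names sorted by decreasing mutual information."""
    scores = {name: mutual_information(flags, labels)
              for name, flags in detections.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy training set (assumed): four class images followed by four non-class images.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
detections = {
    "eye_region":    [1, 1, 1, 1, 0, 0, 0, 0],  # fires exactly on class images
    "hairline":      [1, 1, 1, 0, 0, 0, 0, 1],  # mostly on class images
    "uniform_patch": [1, 0, 1, 0, 1, 0, 1, 0],  # unrelated to the class
}
ranking = rank_fragments(detections, labels)
print(ranking)
```

In this toy data the fragment that fires only on class images carries one full bit about the class, while the fragment that fires equally often inside and outside the class carries none, so a selection step of this kind would keep the former and discard the latter.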


Author information

Authors and Affiliations

The Weizmann Institute of Science, Rehovot, 76100, Israel
Shimon Ullman & Erez Sali

Editor information

Editors and Affiliations

Center for Artificial Vision Research, Korea University, Anam-dong, Seongbuk-ku, Seoul, 136-701, Korea
Seong-Whan Lee
Max-Planck-Institute for Biological Cybernetics, Spemannstr. 38, 72076, Tübingen, Germany
Heinrich H. Bülthoff
Department of Brain and Cognitive Sciences Artificial Intelligence Laboratory, E25-218, Massachusetts Institute of Technology, 45 Carleton Street, Cambridge, MA, 02142, USA
Tomaso Poggio

About this paper

Cite this paper

Ullman, S., Sali, E. (2000). Object Classification Using a Fragment-Based Representation. In: Lee, SW., Bülthoff, H.H., Poggio, T. (eds) Biologically Motivated Computer Vision. BMCV 2000. Lecture Notes in Computer Science, vol 1811. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45482-9_8

DOI: https://doi.org/10.1007/3-540-45482-9_8
Published: 01 February 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67560-0
Online ISBN: 978-3-540-45482-3
eBook Packages: Springer Book Archive
