VC-Dimension Analysis of Object Recognition Tasks

Lindenbaum, Michael; Ben-David, Shai

doi:10.1023/A:1008314532315

Published: January 1999

Volume 10, pages 27–49, (1999)

Journal of Mathematical Imaging and Vision

Michael Lindenbaum¹ &
Shai Ben-David¹

Abstract

We analyze the amount of data needed to carry out various model-based recognition tasks in the context of a probabilistic data collection model. We focus on objects that may be described as semi-algebraic subsets of a Euclidean space. This is a very rich class that includes polynomially described bodies, as well as polygonal objects, as special cases. The class of object transformations considered is wide, and includes perspective and affine transformations of 2D objects, and perspective projections of 3D objects.

We derive upper bounds on the number of data features (associated with non-zero spatial error) which provably suffice for drawing reliable conclusions. Our bounds are based on a quantitative analysis of the complexity of the hypothesis class from which one has to choose. Our central tool is the VC-dimension, a well-studied parameter measuring the combinatorial complexity of families of sets. It turns out that these bounds grow linearly with the task complexity, measured via the VC-dimension of the class of objects one deals with. We show that this VC-dimension is at most logarithmic in the algebraic complexity of the objects and in the cardinality of the model library.
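
To make the linear dependence on the VC-dimension concrete, the following is a minimal sketch of the classical PAC sample-size bound of Blumer, Ehrenfeucht, Haussler, and Warmuth (JACM, 1989), which is the standard result of this kind in learnability theory. It is an illustration only: the abstract does not state the constants or the exact form of the bounds derived in the article, and the function name `blumer_sample_bound` and its parameters (`vc_dim`, `epsilon`, `delta`) are hypothetical names introduced here.

```python
import math


def blumer_sample_bound(vc_dim: int, epsilon: float, delta: float) -> int:
    """Sufficient sample size from the generic PAC bound of Blumer et al. (1989).

    m >= max( (4/eps) * log2(2/delta), (8*d/eps) * log2(13/eps) )

    vc_dim  -- VC-dimension d of the hypothesis class (assumed known)
    epsilon -- accuracy parameter, 0 < epsilon < 1
    delta   -- confidence parameter, 0 < delta < 1

    This is the textbook bound, not the specific bound derived in the article.
    """
    if vc_dim < 1:
        raise ValueError("vc_dim must be a positive integer")
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")

    # Term that depends only on the required confidence.
    confidence_term = (4.0 / epsilon) * math.log2(2.0 / delta)
    # Term that grows linearly with the VC-dimension.
    complexity_term = (8.0 * vc_dim / epsilon) * math.log2(13.0 / epsilon)
    return math.ceil(max(confidence_term, complexity_term))


if __name__ == "__main__":
    # Doubling the VC-dimension roughly doubles the sufficient sample size
    # once the complexity term dominates.
    for d in (5, 10, 20):
        print(d, blumer_sample_bound(d, epsilon=0.1, delta=0.05))
```

In this generic bound the sample size is linear in the VC-dimension for fixed accuracy and confidence, so a VC-dimension that is only logarithmic in the library size would translate into a sample size that is also logarithmic in the library size.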

Our approach borrows from computational learning theory. Both learning and recognition use evidence to infer hypotheses, but, as far as we know, this similarity has not been exploited previously. We draw close relations between recognition tasks and a certain learnability framework, and then apply basic techniques of learnability theory to derive our sample-size upper bounds. We believe that other relations between learning procedures and visual tasks exist, and we hope that this work will trigger further fruitful study along these lines.

Author information

Authors and Affiliations

Computer Science Department, Technion Haifa, 32000, Israel
Michael Lindenbaum & Shai Ben-David

About this article

Cite this article

Lindenbaum, M., Ben-David, S. VC-Dimension Analysis of Object Recognition Tasks. Journal of Mathematical Imaging and Vision 10, 27–49 (1999). https://doi.org/10.1023/A:1008314532315

Issue Date: January 1999
DOI: https://doi.org/10.1023/A:1008314532315
