VC-Dimension Analysis of Object Recognition Tasks

Journal of Mathematical Imaging and Vision

Abstract

We analyze the amount of data needed to carry out various model-based recognition tasks in the context of a probabilistic data collection model. We focus on objects that may be described as semi-algebraic subsets of a Euclidean space. This is a very rich class that includes polynomially described bodies, as well as polygonal objects, as special cases. The class of object transformations considered is wide, and includes perspective and affine transformations of 2D objects, and perspective projections of 3D objects.

We derive upper bounds on the number of data features (associated with non-zero spatial error) that provably suffice for drawing reliable conclusions. Our bounds are based on a quantitative analysis of the complexity of the hypothesis class from which one has to choose. Our central tool is the VC-dimension, a well-studied parameter measuring the combinatorial complexity of families of sets. These bounds grow linearly with the task complexity, measured via the VC-dimension of the class of objects one deals with. We show that this VC-dimension is at most logarithmic in the algebraic complexity of the objects and in the cardinality of the model library.
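To give a feel for how a sample-size bound can grow linearly with the VC-dimension, the following minimal sketch evaluates the classical sufficient sample size from Blumer, Ehrenfeucht, Haussler, and Warmuth (1989) for PAC learning a class of VC-dimension d with accuracy ε and confidence 1 − δ. This is the standard learning-theory bound, used here only as an illustration; it is not claimed to be the exact bound derived in the article, and the function names are hypothetical. The second helper uses the elementary fact that a finite class of N sets has VC-dimension at most ⌊log₂ N⌋, which illustrates a logarithmic dependence on library size in the simplest (finite) setting only.

```python
import math


def blumer_sample_bound(vc_dim: int, epsilon: float, delta: float) -> int:
    """Sufficient sample size for PAC learning (Blumer et al., 1989).

    Illustrative assumption: this is the textbook bound
        m >= max( (4/eps) * log2(2/delta), (8*d/eps) * log2(13/eps) ),
    not necessarily the bound derived in the article.
    """
    if vc_dim < 1:
        raise ValueError("vc_dim must be a positive integer")
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")

    # Term driven by the required confidence 1 - delta.
    confidence_term = (4.0 / epsilon) * math.log2(2.0 / delta)
    # Term driven by the class complexity; linear in the VC-dimension.
    complexity_term = (8.0 * vc_dim / epsilon) * math.log2(13.0 / epsilon)
    return math.ceil(max(confidence_term, complexity_term))


def finite_class_vc_upper_bound(num_hypotheses: int) -> int:
    """Upper bound floor(log2 N) on the VC-dimension of a finite class.

    Shattering d points requires 2**d distinct sets, so d <= log2 N.
    """
    if num_hypotheses < 1:
        raise ValueError("num_hypotheses must be a positive integer")
    # bit_length() - 1 equals floor(log2 N) exactly for positive integers.
    return num_hypotheses.bit_length() - 1


if __name__ == "__main__":
    for d in (5, 10, 20):
        print(d, blumer_sample_bound(d, epsilon=0.1, delta=0.05))
    print(finite_class_vc_upper_bound(1000))
```

When the complexity term dominates, doubling `vc_dim` roughly doubles the returned sample size, which is the linear growth the abstract refers to.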

Our approach borrows from computational learning theory. Both learning and recognition use evidence to infer hypotheses, but as far as we know, their similarity has not been exploited previously. We draw close relations between recognition tasks and a certain learnability framework, and then apply basic techniques of learnability theory to derive our sample size upper bounds. We believe that other relations between learning procedures and visual tasks exist, and we hope that this work will trigger further fruitful study along these lines.


Cite this article

Lindenbaum, M., Ben-David, S. VC-Dimension Analysis of Object Recognition Tasks. Journal of Mathematical Imaging and Vision 10, 27–49 (1999). https://doi.org/10.1023/A:1008314532315

  • DOI: https://doi.org/10.1023/A:1008314532315
