Abstract
We analyze the amount of data needed to carry out various model-based recognition tasks in the context of a probabilistic data collection model. We focus on objects that may be described as semi-algebraic subsets of a Euclidean space. This is a very rich class that includes polynomially described bodies, as well as polygonal objects, as special cases. The class of object transformations considered is wide, and includes perspective and affine transformations of 2D objects, and perspective projections of 3D objects.
We derive upper bounds on the number of data features (associated with non-zero spatial error) which provably suffice for drawing reliable conclusions. Our bounds are based on a quantitative analysis of the complexity of the hypothesis class that one has to choose from. Our central tool is the VC-dimension, a well-studied parameter measuring the combinatorial complexity of families of sets. These bounds grow linearly with the task complexity, measured via the VC-dimension of the class of objects one deals with. We show that this VC-dimension is at most logarithmic in the algebraic complexity of the objects and in the cardinality of the model library.
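To make the linear dependence on the VC-dimension concrete, the sketch below evaluates the classical PAC sample-size bound of Blumer, Ehrenfeucht, Haussler, and Warmuth (1989) for a consistent learner: m >= max((4/eps) log2(2/delta), (8d/eps) log2(13/eps)). This is an illustration under stated assumptions, not the bound derived in this article. The function name `blumer_sample_bound` and the parameter names are hypothetical, and the article's own constants and error model may differ.

```python
import math


def blumer_sample_bound(vc_dim: int, epsilon: float, delta: float) -> int:
    """Sufficient sample size from the classical Blumer et al. (1989) bound.

    Assumption: this is the generic PAC bound for a hypothesis class of
    VC-dimension `vc_dim`, used here only to illustrate linear growth in
    the VC-dimension. It is not the article's recognition-specific bound.
    """
    if vc_dim < 1:
        raise ValueError("vc_dim must be a positive integer")
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")

    # Term driven by the required confidence 1 - delta.
    confidence_term = (4.0 / epsilon) * math.log2(2.0 / delta)
    # Term driven by the VC-dimension; linear in vc_dim for fixed epsilon.
    complexity_term = (8.0 * vc_dim / epsilon) * math.log2(13.0 / epsilon)

    return math.ceil(max(confidence_term, complexity_term))


if __name__ == "__main__":
    # Doubling the VC-dimension roughly doubles the sufficient sample size.
    for d in (5, 10, 20, 40):
        print(d, blumer_sample_bound(d, epsilon=0.1, delta=0.05))
```

Because the VC-dimension is stated to be at most logarithmic in the algebraic complexity and in the library size, a bound of this form would grow only logarithmically in those quantities.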
Our approach borrows from computational learning theory. Both learning and recognition use evidence to infer hypotheses, but as far as we know their similarity has not been exploited previously. We draw close relations between recognition tasks and a certain learnability framework, and then apply basic techniques of learnability theory to derive our sample size upper bounds. We believe that other relations between learning procedures and visual tasks exist, and we hope that this work will trigger further fruitful study along these lines.
Cite this article
Lindenbaum, M., Ben-David, S. VC-Dimension Analysis of Object Recognition Tasks. Journal of Mathematical Imaging and Vision 10, 27–49 (1999). https://doi.org/10.1023/A:1008314532315