Abstract
Computer vision systems, such as “seeing” robots, that are meant to function robustly in a natural, information-rich environment benefit from relying on multiple cues. The problem of integrating these cues then becomes central. Existing approaches to cue integration have typically been based on physical and mathematical models for each cue, and have used estimation and optimization methods to fuse the parameterizations of these models.
In this paper we consider an approach to fusion that does not rely on the underlying models for each cue. It is based on a simple binary voting scheme. A particular feature of such a scheme is that even incommensurable cues, such as intensity and surface orientation, can be fused directly. Another feature is that uncertainties, and the need to normalize them, are avoided. Instead, consensus among several cues is considered non-accidental and is used as support for hypotheses about whatever structure is sought. It is shown that only a small set of cues needs to agree to obtain a reliable output.
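As a rough illustration of the binary voting idea, the following minimal sketch assumes that each cue casts one yes/no vote per hypothesis and that a hypothesis is accepted when a fixed number of cues agree. The function name `binary_vote`, the threshold parameter `min_agree`, and the cue data are hypothetical and are not taken from the paper; the actual scheme may differ.

```python
from typing import List, Sequence


def binary_vote(cue_votes: Sequence[Sequence[bool]], min_agree: int) -> List[bool]:
    """Fuse binary votes from several cues (illustrative assumption, not the paper's code).

    cue_votes[c][h] is True when cue c supports hypothesis h.
    A hypothesis is accepted when at least min_agree cues support it.
    """
    if not cue_votes:
        return []
    n_hyp = len(cue_votes[0])
    if any(len(votes) != n_hyp for votes in cue_votes):
        raise ValueError("every cue must vote on the same set of hypotheses")
    # Count supporting cues per hypothesis; no weights or uncertainties are used.
    support = [sum(1 for votes in cue_votes if votes[h]) for h in range(n_hyp)]
    return [count >= min_agree for count in support]


# Hypothetical votes from three cues over four candidate image regions.
cues = [
    [True, False, True, False],   # e.g. an intensity-based cue
    [True, True, False, False],   # e.g. a surface-orientation cue
    [True, False, True, False],   # e.g. a disparity-based cue
]
accepted = binary_vote(cues, min_agree=2)
```

Because each cue contributes only a yes or no, cues measured in unrelated units can be combined without any normalization step.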
We apply the proposed technique to finding instances of planar surfaces in binocular images, without resorting to scene reconstruction or segmentation. The results are, of course, not comparable to the best results obtainable by complete scene reconstruction. However, they provide the most obvious instances of planes even with rather crude assumptions and coarse algorithms. Even though the precise extent of the planar patches is not derived, good overall hypotheses are obtained.
Our work applies voting schemes beyond earlier attempts and approaches the cue integration problem in a novel manner. Although further research is needed to establish the full applicability of our technique, our results so far seem quite useful.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bräutigam, C.G., Eklundh, J.O., Christensen, H.I. (1998). A model-free voting approach for integrating multiple cues. In: Burkhardt, H., Neumann, B. (eds) Computer Vision — ECCV'98. ECCV 1998. Lecture Notes in Computer Science, vol 1406. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0055701
DOI: https://doi.org/10.1007/BFb0055701
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64569-6
Online ISBN: 978-3-540-69354-3
eBook Packages: Springer Book Archive