Abstract
This paper addresses the problem of human motion tracking from multiple image sequences. The human body is described by five articulated mechanical chains, and human body parts are described by volumetric primitives with curved surfaces. If such a surface is observed with a camera, an extremal contour appears in the image whenever the surface turns smoothly away from the viewer. We describe a method that recovers human motion through a kinematic parameterization of these extremal contours. The method exploits the fact that the observed image motion of these contours is a function both of the rigid displacement of the surface and of the relative position and orientation between the viewer and the curved surface. First, we describe a parameterization of an extremal-contour point velocity for the case of developable surfaces. Second, we use the zero-reference kinematic representation and derive an explicit formula that links extremal-contour velocities to the angular velocities associated with the kinematic model. Third, we show how the chamfer distance may be used to measure the discrepancy between predicted extremal contours and observed image contours; moreover, we show how the chamfer distance can be used as a differentiable multi-valued function and how the tracker based on this distance can be cast into a continuous non-linear optimization framework. Fourth, we describe implementation issues associated with a practical human-body tracker that may use an arbitrary number of cameras. One great methodological and practical advantage of our method is that it relies neither on model-to-image nor on image-to-image point matches. In practice we model people with 5 kinematic chains, 19 volumetric primitives, and 54 degrees of freedom; we observe silhouettes in images gathered with several synchronized and calibrated cameras. The tracker has been successfully applied to several complex motions gathered at 30 frames/second.
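To make the chamfer-distance idea concrete, here is a minimal sketch in Python. It is not the authors' implementation: it assumes a binary edge image given as nested lists, uses the classic two-pass 3-4 chamfer approximation of the Euclidean distance, and scores a set of predicted contour points by their mean distance to the nearest observed edge pixel. The names `chamfer_transform` and `chamfer_score` are hypothetical, and the nearest-pixel lookup stands in for whatever interpolation a differentiable tracker would need.

```python
from typing import List, Sequence, Tuple

INF = float("inf")


def chamfer_transform(edges: Sequence[Sequence[int]]) -> List[List[float]]:
    """Two-pass 3-4 chamfer transform: approximate distance (in pixels)
    from every pixel to the nearest non-zero edge pixel."""
    h, w = len(edges), len(edges[0])
    d = [[0.0 if edges[y][x] else INF for x in range(w)] for y in range(h)]

    # Forward pass: propagate distances from the left and upper neighbours.
    for y in range(h):
        for x in range(w):
            if x > 0:
                d[y][x] = min(d[y][x], d[y][x - 1] + 3)
            if y > 0:
                d[y][x] = min(d[y][x], d[y - 1][x] + 3)
                if x > 0:
                    d[y][x] = min(d[y][x], d[y - 1][x - 1] + 4)
                if x < w - 1:
                    d[y][x] = min(d[y][x], d[y - 1][x + 1] + 4)

    # Backward pass: propagate distances from the right and lower neighbours.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if x < w - 1:
                d[y][x] = min(d[y][x], d[y][x + 1] + 3)
            if y < h - 1:
                d[y][x] = min(d[y][x], d[y + 1][x] + 3)
                if x < w - 1:
                    d[y][x] = min(d[y][x], d[y + 1][x + 1] + 4)
                if x > 0:
                    d[y][x] = min(d[y][x], d[y + 1][x - 1] + 4)

    # Weights 3 (axial) and 4 (diagonal) are scaled by 3, so divide to get pixels.
    return [[v / 3.0 for v in row] for row in d]


def chamfer_score(dt: Sequence[Sequence[float]],
                  points: Sequence[Tuple[float, float]]) -> float:
    """Mean chamfer distance of predicted contour points (x, y) to the edges.
    Points are rounded to the nearest pixel and clamped to the image."""
    h, w = len(dt), len(dt[0])
    total = 0.0
    for px, py in points:
        x = min(max(int(round(px)), 0), w - 1)
        y = min(max(int(round(py)), 0), h - 1)
        total += dt[y][x]
    return total / len(points)


if __name__ == "__main__":
    # Toy 5x5 edge image with a single observed edge pixel at the centre.
    edges = [[0] * 5 for _ in range(5)]
    edges[2][2] = 1
    dt = chamfer_transform(edges)
    print(chamfer_score(dt, [(2, 2), (4, 2)]))
```

Because the distance image is computed once per frame from the observed contours, evaluating a candidate pose only requires looking up the predicted contour points in it, which is why no explicit point matches are needed.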
About this article
Cite this article
Knossow, D., Ronfard, R. & Horaud, R. Human Motion Tracking with a Kinematic Parameterization of Extremal Contours. Int J Comput Vis 79, 247–269 (2008). https://doi.org/10.1007/s11263-007-0116-2
DOI: https://doi.org/10.1007/s11263-007-0116-2