View-Invariant Representation and Recognition of Actions

Rao, Cen; Yilmaz, Alper; Shah, Mubarak

doi:10.1023/A:1020350100748

Published: November 2002

Volume 50, pages 203–226 (2002)

International Journal of Computer Vision

Cen Rao¹,
Alper Yilmaz¹ &
Mubarak Shah¹

786 Accesses
334 Citations

Abstract

Analysis of human perception of motion shows that the information for representing a motion is obtained from dramatic changes in the speed and direction of its trajectory. In this paper, we present a computational representation of human action that captures these dramatic changes using the spatio-temporal curvature of a 2-D trajectory. This representation is compact and view-invariant, and it can explain an action in terms of meaningful action units called dynamic instants and intervals. A dynamic instant is an instantaneous entity that occurs for only one frame and represents an important change in the motion characteristics. An interval is the time period between two dynamic instants, during which the motion characteristics do not change. Starting without a model, we use this representation for recognition and incremental learning of human actions. The proposed method can discover instances of the same action performed by different people from different viewpoints. Experiments on 47 actions performed by 7 individuals in an environment with no constraints show the robustness of the proposed method.
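
The abstract does not state the curvature formula, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It treats a tracked 2-D trajectory as the space curve r(t) = (x(t), y(t), t) and applies the standard differential-geometry curvature κ = |r′ × r″| / |r′|³. The central finite differences, the local-maximum rule, the threshold value, and the function names `spatiotemporal_curvature` and `dynamic_instants` are all illustrative assumptions.

```python
import math


def spatiotemporal_curvature(xs, ys, dt=1.0):
    """Curvature of the space curve r(t) = (x(t), y(t), t), one value per frame.

    Assumption: derivatives are estimated with central finite differences,
    so the first and last frames are left at 0.0.
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two equal-length sequences with at least 3 samples")
    kappa = [0.0] * n
    for i in range(1, n - 1):
        # First and second derivatives of x and y with respect to time.
        x1 = (xs[i + 1] - xs[i - 1]) / (2.0 * dt)
        y1 = (ys[i + 1] - ys[i - 1]) / (2.0 * dt)
        x2 = (xs[i + 1] - 2.0 * xs[i] + xs[i - 1]) / dt ** 2
        y2 = (ys[i + 1] - 2.0 * ys[i] + ys[i - 1]) / dt ** 2
        # r' = (x1, y1, 1) and r'' = (x2, y2, 0), so
        # r' x r'' = (-y2, x2, x1*y2 - y1*x2).
        cross = math.sqrt(y2 ** 2 + x2 ** 2 + (x1 * y2 - y1 * x2) ** 2)
        speed = math.sqrt(x1 ** 2 + y1 ** 2 + 1.0)
        kappa[i] = cross / speed ** 3
    return kappa


def dynamic_instants(kappa, threshold):
    """Frames where curvature is a local maximum above a threshold.

    Assumption: this peak rule is a stand-in for whatever detection
    criterion the paper actually uses.
    """
    return [
        i
        for i in range(1, len(kappa) - 1)
        if kappa[i] > threshold
        and kappa[i] >= kappa[i - 1]
        and kappa[i] > kappa[i + 1]
    ]


# Hypothetical trajectory: move right for 10 frames, then up for 10 frames.
xs = list(range(11)) + [10] * 10
ys = [0] * 11 + list(range(1, 11))
kappa = spatiotemporal_curvature(xs, ys)
instants = dynamic_instants(kappa, threshold=0.1)
```

In this synthetic example the curvature is zero along both straight, constant-speed segments and peaks at the single frame where the direction changes, which is the frame the sketch reports as a dynamic instant; the two segments on either side play the role of intervals. Because time is the third coordinate, a sudden change in speed along a straight path also produces non-zero curvature.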

Author information

Authors and Affiliations

Computer Vision Laboratory, School of Electrical Engineering and Computer Science, University of Central Florida, Orlando, FL, 32816, USA
Cen Rao, Alper Yilmaz & Mubarak Shah

About this article

Cite this article

Rao, C., Yilmaz, A. & Shah, M. View-Invariant Representation and Recognition of Actions. International Journal of Computer Vision 50, 203–226 (2002). https://doi.org/10.1023/A:1020350100748

Issue Date: November 2002
DOI: https://doi.org/10.1023/A:1020350100748
