
Image and Vision Computing

Volume 26, Issue 12, 1 December 2008, Pages 1621-1635

Extraction and temporal segmentation of multiple motion trajectories in human motion

https://doi.org/10.1016/j.imavis.2008.03.006

Abstract

A new method for extraction and temporal segmentation of multiple motion trajectories in human motion is presented. The proposed method extracts motion trajectories generated by body parts without any initialization or any assumption about color distribution. Motion trajectories are very compact and representative features for activity recognition. Tracking human body parts (hands and feet) is inherently difficult because the body parts that generate most of the motion trajectories are relatively small compared to the whole body. This problem is overcome by using a new motion segmentation method: at every frame, candidate motion locations are detected and set as significant motion points (SMPs). The motion trajectories are obtained by combining these SMPs with the results of a color–optical flow based tracker. These motion trajectories are in turn used as features for temporal segmentation of specific activities from continuous video sequences. The proposed approach is tested on actual ballet step sequences. Experimental results show that the proposed method can successfully extract and temporally segment multiple motion trajectories from human motion.

Introduction

As more and more video data become available, there is growing interest in video indexing and classification techniques. Human activities are very good cues for indexing and classification in most video sequences. Fig. 1 shows some examples of human movements. As these pictures show, in most cases human activities can be described by the motion trajectories generated by body parts, suggesting that motion trajectories could potentially be used as features for activity recognition.

This paper presents a new method for extracting motion trajectories from human motions and also shows how to extract temporal segmentation of specific activities from continuous video sequences.

Motion trajectories have several advantages over other features such as intensity [31], [32], silhouettes [27], [28], [29], and contours [33]. Motion trajectories are very compact, as each motion is represented by a pixel location with correspondences between subsequent frames. Since motion trajectories explicitly specify the movements of the body parts, they are very representative and smooth. Finally, motion trajectories are separable, for they are generated from different body parts separately (e.g., in the first sequence in Fig. 1, motion trajectories are generated from the left hand to the right hand and then to the right foot).
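
To illustrate the compactness claim, a trajectory needs as little as one (frame, x, y) triple per frame. The `Trajectory` class below is a hypothetical sketch of such a representation, not the paper's actual data structure; all names are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Trajectory:
    """Illustrative container: one (frame, x, y) sample per frame."""
    body_part: str                               # e.g. "left_hand" (label is an assumption)
    points: list = field(default_factory=list)   # [(frame_idx, x, y), ...]

    def add(self, frame_idx, x, y):
        self.points.append((frame_idx, x, y))

    def as_displacements(self):
        """Frame-to-frame displacement vectors, a smooth derived feature."""
        return [(x2 - x1, y2 - y1)
                for (_, x1, y1), (_, x2, y2) in zip(self.points, self.points[1:])]

t = Trajectory("left_hand")
for f, (x, y) in enumerate([(10, 20), (12, 23), (15, 27)]):
    t.add(f, x, y)
print(t.as_displacements())  # [(2, 3), (3, 4)]
```

A whole body-part movement thus costs only a few hundred numbers, far less than storing silhouettes or contours per frame.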

In this work, motion trajectories are used as features for achieving temporal segmentation of specific human activities from continuous video sequences. In most of the available video sequences, a large number of video segments with different contents are included in a continuous fashion. Therefore, the temporal segmentation of specific contents is a very critical task in video indexing systems.

To extract motion trajectories without any initialization, dominant motions should be detected and tracked. The dominant motion is in turn extracted from articulated motions; i.e., when the whole arm is moving in a hand gesture, the hand, which generates the dominant motion, will be detected. Previous motion detection algorithms, such as those described in [34], [35], [36], [37], and the motion segmentation algorithms described in [38], [39], [42] do not handle articulated motions and are therefore not suitable for our purposes.

In many human activities, such as dance and sports, the movement of the whole body is typically captured in video. In these sequences, the body parts (hands and feet) generating the motion trajectories appear as small regions, which makes it very difficult to obtain enough information about the parts to model their shape or color distribution. Fig. 2 shows the results of the kernel-based object tracker [15], which maximizes the likelihood of the color distribution. For both hand and foot tracking, the tracker failed after several frames. The similarity of the color distributions between hands and arms (and between feet and legs) also contributes to the failures.

To overcome the problems described above, we propose a new motion segmentation method that uses mode seeking on the optical flow magnitude to find dominant motion blobs in each frame. The primary significant motion point (SMP) in each motion blob is obtained as a by-product of the motion segmentation algorithm. After the SMPs (Fig. 3) are obtained in every frame, they are used as candidate locations of trajectories, making the tracking procedure possible without any initialization. In each frame, the SMPs are either connected to continuing tracks (trajectories) or new tracks are started from these points. To make the tracking procedure more robust and reliable, our color–optical flow based tracker is applied to each continuing track. This tracker calculates the displacement (tracking result) of each continuing track in the current frame. This displacement and the SMPs are then used as candidate locations in the current frame, after which the best matches between the continuing tracks and the candidate locations are found by optimizing a cost function.
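
As an illustration only, the mode-seeking step can be sketched as a weighted mean shift on the optical flow magnitude map: a window center is repeatedly moved to the magnitude-weighted centroid of nearby pixels until it converges on a local mode, which approximates the SMP of a motion blob. The function name, flat-kernel window, bandwidth, and tolerance below are assumptions, not the paper's exact formulation.

```python
import numpy as np

def find_smp(flow_mag, start, bandwidth=15, iters=30, tol=0.5):
    """Weighted mean shift on an optical-flow magnitude map (illustrative).

    flow_mag : 2-D array of per-pixel optical flow magnitudes.
    start    : (y, x) initial window center.
    Returns the converged (y, x) mode, a stand-in for the SMP.
    """
    h, w = flow_mag.shape
    ys, xs = np.mgrid[0:h, 0:w]
    c = np.array(start, dtype=float)
    for _ in range(iters):
        # Flat circular window of radius `bandwidth` around the current center.
        d2 = (ys - c[0]) ** 2 + (xs - c[1]) ** 2
        wts = flow_mag * (d2 <= bandwidth ** 2)
        total = wts.sum()
        if total == 0:
            break
        # Move to the magnitude-weighted centroid of the window.
        new_c = np.array([(ys * wts).sum(), (xs * wts).sum()]) / total
        if np.hypot(*(new_c - c)) < tol:
            c = new_c
            break
        c = new_c
    return c
```

Started anywhere inside a motion blob, the center drifts toward the blob's magnitude peak, so no per-part initialization is needed.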

The multiple motion trajectories obtained by the approach described above are used for temporal segmentation of activities. For each time instance, the optimal alignment between the trajectories of the test sequence and the model trajectories is found. A dissimilarity score is then calculated using the dynamic time warping (DTW) algorithm. The algorithm estimates the start and end point of each activity. Our approach provides temporal segmentation of individual body-part trajectories (movements) as well as of their combinations.

This paper is organized as follows. Section 2 presents a brief review of work on extraction of motion trajectories and on activity recognition based on motion trajectories. Sections 3 and 4 describe in detail the proposed algorithms for extraction of motion trajectories and for temporal segmentation, respectively. Section 5 discusses the data and results of our experiments. Finally, we present our conclusions in Section 6.

Section snippets

Previous work

A number of attempts have been made to obtain motion trajectories by solving the motion correspondence problem. One of the best known statistical approaches is the multiple hypothesis tracker (MHT) [1]. The MHT attempts to solve the data association problem by finding the best hypothesis, where each hypothesis represents an assignment of measurements to features. There have been efforts to make the MHT more practical by restricting the number of hypotheses [2], [3]. Another statistical approach is

Extraction of motion trajectories

An overview of our approach to extract motion trajectories is shown in Fig. 4. The input for this extraction procedure is a video sequence. The results from SMP detection and the color–optical flow based tracker are fused to generate more reliable motion trajectories.
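
The fusion step can be viewed as a matching problem: the predicted positions of continuing tracks must be paired with candidate locations (the SMPs plus the tracker outputs), and unmatched candidates start new tracks. The greedy nearest-neighbor strategy, the distance threshold, and all names below are illustrative assumptions; the paper instead optimizes a cost function, whose exact form is given in the full text.

```python
import math

def match_tracks(track_preds, candidates, max_dist=25.0):
    """Greedy one-to-one matching of track predictions to candidate locations
    (illustrative stand-in for a cost-function optimization).

    track_preds : list of (x, y) predicted positions of continuing tracks.
    candidates  : list of (x, y) candidate locations (SMPs + tracker results).
    Returns (matches, new_tracks): matches maps track index -> candidate index;
    new_tracks lists candidate indices left over, which would start new tracks.
    """
    # All (distance, track, candidate) pairs, cheapest first.
    pairs = sorted(
        ((math.dist(p, c), ti, ci)
         for ti, p in enumerate(track_preds)
         for ci, c in enumerate(candidates)),
        key=lambda x: x[0])
    used_t, used_c, matches = set(), set(), {}
    for d, ti, ci in pairs:
        if d <= max_dist and ti not in used_t and ci not in used_c:
            matches[ti] = ci
            used_t.add(ti)
            used_c.add(ci)
    new_tracks = [ci for ci in range(len(candidates)) if ci not in used_c]
    return matches, new_tracks

matches, new = match_tracks([(10, 10), (50, 50)], [(12, 9), (100, 100), (48, 52)])
print(matches, new)  # {0: 0, 1: 2} [1]
```

An optimal (rather than greedy) one-to-one assignment could be obtained with the Hungarian algorithm, but the greedy version suffices to show the structure of the fusion.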

Temporal segmentation of motion trajectories

The trajectories established by the method described in the previous section are used for temporal segmentation of specific activities from continuous video sequences.

Our algorithm uses dynamic time warping (DTW) to deal with the temporal variance between two different segments of motion trajectories. DTW finds the optimal alignment of two temporal signals using a similarity score at each time instance. More details on DTW can be found in [40], [41]. For two temporal signals, A and B, the
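
For concreteness, a minimal textbook DTW on 1-D sequences is sketched below, using absolute difference as the per-step cost. The paper applies DTW to multi-dimensional trajectory features; this sketch only illustrates the alignment recurrence and is not the paper's exact formulation.

```python
def dtw_distance(a, b):
    """Classic DTW between two 1-D sequences (textbook form, cf. [40], [41]).

    D[i][j] = cost(a[i-1], b[j-1]) + min(insert, delete, match),
    so the result is the total cost of the optimal temporal alignment.
    """
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # a[i-1] unmatched (insertion)
                                 D[i][j - 1],      # b[j-1] unmatched (deletion)
                                 D[i - 1][j - 1])  # match step
    return D[n][m]

# A time-stretched copy of a signal aligns perfectly, unlike a shifted one:
print(dtw_distance([0, 1, 2, 3], [0, 1, 1, 2, 3]))  # 0.0
```

This tolerance to local time stretching is what makes DTW suitable for comparing performances of the same activity executed at different speeds.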

Description of data

Actual ballet sequences from a commercial video (American Ballet Theater) and ballet sequences captured with a digital camcorder were used to test the extraction of motion trajectories. For the temporal segmentation experiments, each ballet movement was captured, for both training and testing sequences, using a digital camcorder mounted on a stand. In the video sequences used for the experiment, three dancers performed each ballet step more than six times. The sequences were captured with several different

Conclusions

A new approach for extraction and temporal segmentation of motion trajectories is presented. Multiple motion trajectories are extracted from human motion without any initialization or any assumption about color distribution. Separate activity recognition for hands and feet is possible, which is not feasible with other features. Temporal segmentation of hand and leg movements provides a more accurate interpretation of whole-body movements. Experimental results show that the motion

Acknowledgments

The authors thank Padmanabhan Soundararajan, who provided many valuable comments that helped significantly improve the presentation of this paper. The authors also thank Michael J. Black for making his optical flow computation code available. A portion of this paper has appeared in [43], [44], [45].

References (45)

  • K. Rangarajan et al., Establishing motion correspondence, CVGIP: Image Understanding (1991)
  • M.J. Black et al., The robust estimation of multiple motions: parametric and piecewise-smooth flow fields, Computer Vision and Image Understanding (1996)
  • D.B. Reid, An algorithm for tracking multiple targets, IEEE Transactions on Automatic Control (1979)
  • I.J. Cox et al., An efficient implementation of Reid’s multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking, IEEE Transactions on Pattern Analysis and Machine Intelligence (1996)
  • I.J. Cox et al., A comparison of two algorithms for determining ranked assignments with application to multitarget tracking and motion correspondence, IEEE Transactions on Aerospace Electronic Systems (1997)
  • M. Shah et al., Motion trajectories, IEEE Transactions on Systems, Man, and Cybernetics (1993)
  • C. Rao et al., View-invariant representation and recognition of actions, International Journal of Computer Vision (2002)
  • C. Stauffer et al., Learning patterns of activity using real-time tracking, IEEE Transactions on Pattern Analysis and Machine Intelligence (2000)
  • C. Rasmussen et al., Probabilistic data association methods for tracking complex visual objects, IEEE Transactions on Pattern Analysis and Machine Intelligence (2001)
  • C.J. Veenman et al., Resolving motion correspondence for densely moving points, IEEE Transactions on Pattern Analysis and Machine Intelligence (2001)
  • A.D. Wilson et al., Parametric Hidden Markov Models for gesture recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence (1999)
  • M.-H. Yang et al., Extraction of 2D motion trajectories and its application to hand gesture recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence (2002)
  • H.-K. Lee et al., An HMM-based threshold model approach for gesture recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence (1999)
  • Y. Cheng, Mean shift, mode seeking, and clustering, IEEE Transactions on Pattern Analysis and Machine Intelligence (1995)
  • P. Meer et al., Mean shift: a robust approach toward feature space analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence (2002)
  • D. Comaniciu et al., Kernel-based object tracking, IEEE Transactions on Pattern Analysis and Machine Intelligence (2003)
  • G.R. Bradski, Computer vision face tracking for use in a perceptual user interface, Intel Technology Journal, 2nd...
  • C. Rao, A. Gritai, M. Shah, View-invariant alignment and matching of video sequences, in: Proceedings of the IEEE...
  • M.J. Black, A.D. Jepson, A probabilistic framework for matching temporal trajectories: condensation-based recognition of...
  • A.F. Bobick et al., A state-based approach to the representation and recognition of gestures, IEEE Transactions on Pattern Analysis and Machine Intelligence (1997)
  • J. Barron et al., Performance of optical flow techniques, International Journal of Computer Vision (1994)
  • P. Anandan, A computational framework and an algorithm for the measurement of visual motion, International Journal of Computer Vision (1989)
1 This work was done while Junghye Min was with the Department of Computer Science and Engineering at Pennsylvania State University, pursuing her Ph.D.
