ABSTRACT
This paper studies five methods for combining multi-view data from an uncalibrated smart camera network for human activity recognition. The multi-view classification scenarios studied fall into two categories: view selection and view fusion. Selection classifies using a single view, whereas fusion merges multi-view data at either the feature level or the label level. The five methods are compared on the task of classifying human activities in three fully annotated datasets (MAS, VIHASI and HOMELAB) and in a combined dataset, MAS+VIHASI. Classification is based on image features computed from silhouette images, using a binary-tree-structured classifier with a 1D CRF for temporal modeling. The results show that fusion methods outperform practical selection methods. Selection methods have their advantages, but they depend strongly on how good the selection criterion is and on how well that criterion adapts to different environments. Furthermore, feature fusion outperforms the other scenarios in more controlled settings. However, the more variability there is in camera placement and in the characteristics of the persons observed, the more likely it is that combining candidate labels will improve multi-view activity recognition accuracy.
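The distinction between view selection, feature-level fusion, and label-level fusion can be illustrated with a minimal sketch. This is not the classifier or the selection criterion used in the paper: the function names, the per-view `quality` score, the majority-vote rule, and the toy threshold classifier are all hypothetical and serve only to show where in the pipeline each scenario combines the views.

```python
from collections import Counter
from typing import Callable, List, Sequence

Features = Sequence[float]
Classifier = Callable[[Features], str]


def classify_selected_view(
    views: Sequence[Features],
    quality: Sequence[float],
    classifier: Classifier,
) -> str:
    """View selection: classify only the view with the highest quality score.

    `quality` is a hypothetical per-view selection criterion (assumption).
    """
    best = max(range(len(views)), key=lambda i: quality[i])
    return classifier(views[best])


def classify_fused_features(
    views: Sequence[Features],
    classifier: Classifier,
) -> str:
    """Feature-level fusion: concatenate all view features, classify once."""
    fused: List[float] = []
    for view in views:
        fused.extend(view)
    return classifier(fused)


def classify_fused_labels(
    views: Sequence[Features],
    classifier: Classifier,
) -> str:
    """Label-level fusion: classify each view, then combine candidate labels.

    Majority voting is an assumed combination rule for illustration only.
    """
    labels = [classifier(view) for view in views]
    return Counter(labels).most_common(1)[0][0]


def toy_classifier(features: Features) -> str:
    """Hypothetical stand-in classifier: thresholds the mean feature value."""
    mean = sum(features) / len(features)
    return "walk" if mean > 0.5 else "stand"


# Three camera views of the same moment, two features per view (made-up data).
views = [[0.9, 0.8], [0.2, 0.1], [0.7, 0.6]]
quality = [0.3, 0.9, 0.5]  # hypothetical selection scores, one per view

selected = classify_selected_view(views, quality, toy_classifier)
feature_fused = classify_fused_features(views, toy_classifier)
label_fused = classify_fused_labels(views, toy_classifier)
```

In this toy data the selection criterion favours the second view, so selection returns a different label from the two fusion variants. That mirrors the abstract's point that selection depends heavily on how good the selection criterion is.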