Abstract
The goal of this paper is to evaluate how fusing audio and visual features can help in the challenging task of identifying people by their gait (i.e. the way they walk), known as gait recognition. Most previous research on gait recognition has focused on designing visual descriptors, mainly based on binary silhouettes, or on building sophisticated machine learning frameworks. However, little attention has been paid to the audio patterns associated with the action of walking. We therefore propose and evaluate a multimodal system for gait recognition. The proposed approach is evaluated on the challenging ‘TUM GAID’ dataset, which contains audio recordings in addition to image sequences. The experimental results show that using late fusion to combine two kinds of tracklet-based visual features with audio features improves on the state-of-the-art results for the standard experiments defined on the dataset.
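As a rough illustration of what late fusion means here, the sketch below combines per-modality classifier scores after each modality has been classified independently. It is only a minimal example under stated assumptions: the abstract does not specify the fusion rule, so the weighted sum of softmax-normalised scores, the function names, the modality names, the weights, and the score values are all hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Map raw classifier scores to a probability-like vector."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(scores_by_modality: Dict[str, Sequence[float]],
                weights: Dict[str, float]) -> int:
    """Return the index of the subject with the highest fused score.

    Each modality contributes its own per-subject scores; fusion happens
    only at the score level (an assumed weighted-sum rule).
    """
    n_subjects = len(next(iter(scores_by_modality.values())))
    fused = [0.0] * n_subjects
    for name, scores in scores_by_modality.items():
        for i, p in enumerate(softmax(scores)):
            fused[i] += weights[name] * p
    return max(range(n_subjects), key=fused.__getitem__)


# Hypothetical scores for three subjects from two visual classifiers
# and one audio classifier (illustrative values, not real results).
example_scores = {
    "visual_a": [2.0, 1.0, 0.1],
    "visual_b": [0.5, 1.5, 0.2],
    "audio": [0.1, 2.5, 0.3],
}
example_weights = {"visual_a": 1 / 3, "visual_b": 1 / 3, "audio": 1 / 3}

predicted_subject = late_fusion(example_scores, example_weights)
```

In this toy case the first visual classifier alone would pick subject 0, while the fused decision follows the agreement between the other two modalities. In practice the weights would be tuned on validation data rather than fixed by hand.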
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Castro, F.M., Marín-Jiménez, M.J., Guil, N. (2015). Empirical Study of Audio-Visual Features Fusion for Gait Recognition. In: Azzopardi, G., Petkov, N. (eds) Computer Analysis of Images and Patterns. CAIP 2015. Lecture Notes in Computer Science, vol 9256. Springer, Cham. https://doi.org/10.1007/978-3-319-23192-1_61
DOI: https://doi.org/10.1007/978-3-319-23192-1_61
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23191-4
Online ISBN: 978-3-319-23192-1
eBook Packages: Computer Science, Computer Science (R0)