Empirical Study of Audio-Visual Features Fusion for Gait Recognition

Castro, Francisco M.; Marín-Jiménez, Manuel J.; Guil, Nicolás

doi:10.1007/978-3-319-23192-1_61

Conference paper
First Online: 01 January 2015

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9256)

Abstract

This paper evaluates how fusing audio and visual features can help in the challenging task of identifying people by their gait (i.e. the way they walk), known as gait recognition. Most previous research on gait recognition has focused on designing visual descriptors, mainly from binary silhouettes, or on building sophisticated machine learning frameworks. However, little attention has been paid to the audio patterns associated with walking. We therefore propose and evaluate a multimodal system for gait recognition. The proposed approach is evaluated on the challenging ‘TUM GAID’ dataset, which contains audio recordings in addition to image sequences. The experimental results show that using late fusion to combine two kinds of tracklet-based visual features with audio features improves on the state-of-the-art results for the standard experiments defined on the dataset.
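To make the idea of late fusion concrete, here is a minimal sketch in Python. The abstract does not say which fusion rule the authors use, so this sketch assumes one common variant: each modality's classifier scores are normalised with a softmax and then combined as a weighted sum. The modality names (`visual_a`, `visual_b`, `audio`), the weights, and the example scores are all hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Map raw classifier scores to a vector that sums to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(scores_by_modality: Dict[str, Sequence[float]],
                weights: Dict[str, float]) -> List[float]:
    """Weighted sum of softmax-normalised scores, one entry per subject.

    Assumption: every modality has its own classifier that outputs one
    score per enrolled subject, in the same subject order.
    """
    if not scores_by_modality:
        raise ValueError("at least one modality is required")
    n_subjects = len(next(iter(scores_by_modality.values())))
    weight_sum = sum(weights[name] for name in scores_by_modality)
    fused = [0.0] * n_subjects
    for name, scores in scores_by_modality.items():
        if len(scores) != n_subjects:
            raise ValueError(f"modality {name!r} has a different subject count")
        share = weights[name] / weight_sum  # weights are renormalised to sum to 1
        for subject, prob in enumerate(softmax(scores)):
            fused[subject] += share * prob
    return fused


def identify(fused: Sequence[float]) -> int:
    """Return the index of the subject with the highest fused score."""
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical scores for three enrolled subjects (illustrative values only).
example_scores = {
    "visual_a": [2.0, 1.0, 0.1],
    "visual_b": [1.5, 1.4, 0.2],
    "audio": [0.3, 2.5, 0.1],
}
example_weights = {"visual_a": 0.4, "visual_b": 0.4, "audio": 0.2}

fused_scores = late_fusion(example_scores, example_weights)
print("fused scores:", fused_scores)
print("predicted subject index:", identify(fused_scores))
```

Because the fusion happens on classifier outputs rather than on raw features, each modality can keep its own descriptor and classifier, and a modality can be dropped or reweighted without retraining the others.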

Author information

Authors and Affiliations

Department of Computer Architecture, University of Malaga, Málaga, Spain
Francisco M. Castro & Nicolás Guil
Department of Computing and Numerical Analysis, University of Cordoba, Córdoba, Spain
Manuel J. Marín-Jiménez

Corresponding author

Correspondence to Manuel J. Marín-Jiménez.

Editor information

Editors and Affiliations

University of Malta, Msida, Malta
George Azzopardi
University of Groningen, Groningen, The Netherlands
Nicolai Petkov

About this paper

Cite this paper

Castro, F.M., Marín-Jiménez, M.J., Guil, N. (2015). Empirical Study of Audio-Visual Features Fusion for Gait Recognition. In: Azzopardi, G., Petkov, N. (eds) Computer Analysis of Images and Patterns. CAIP 2015. Lecture Notes in Computer Science, vol 9256. Springer, Cham. https://doi.org/10.1007/978-3-319-23192-1_61

DOI: https://doi.org/10.1007/978-3-319-23192-1_61
Published: 25 August 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23191-4
Online ISBN: 978-3-319-23192-1
eBook Packages: Computer Science, Computer Science (R0)
