Empirical Study of Audio-Visual Features Fusion for Gait Recognition

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9256)

Abstract

The goal of this paper is to evaluate how fusing audio and visual features can help in the challenging task of identifying people by their gait (i.e. the way they walk), known as gait recognition. Most previous research on gait recognition has focused on designing visual descriptors, mainly based on binary silhouettes, or on building sophisticated machine learning frameworks. However, little attention has been paid to the audio patterns associated with walking. We therefore propose and evaluate a multimodal system for gait recognition. The approach is evaluated on the challenging ‘TUM GAID’ dataset, which contains audio recordings in addition to image sequences. The experimental results show that using late fusion to combine two kinds of tracklet-based visual features with audio features improves on the state-of-the-art results for the standard experiments defined on the dataset.
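
To illustrate the general idea of late fusion mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes each modality (two visual feature types and one audio feature type) already has its own classifier producing one raw score per enrolled subject, and that the fused decision is a weighted sum of softmax-normalized scores. The function names, the example scores, and the weights are all hypothetical; the abstract does not specify the fusion rule or any weights.

```python
import math


def softmax(scores):
    """Map raw classifier scores to a probability-like vector."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(modality_scores, weights):
    """Fuse per-modality scores with a weighted sum (hypothetical rule).

    modality_scores: one list of per-subject scores per modality.
    weights: one non-negative weight per modality.
    Returns (fused scores, index of the predicted subject).
    """
    if len(modality_scores) != len(weights):
        raise ValueError("need exactly one weight per modality")
    n_subjects = len(modality_scores[0])
    fused = [0.0] * n_subjects
    for scores, weight in zip(modality_scores, weights):
        if len(scores) != n_subjects:
            raise ValueError("all modalities must score the same subjects")
        for i, p in enumerate(softmax(scores)):
            fused[i] += weight * p
    predicted = max(range(n_subjects), key=fused.__getitem__)
    return fused, predicted


# Made-up scores for three enrolled subjects (illustration only).
visual_a = [2.0, 1.0, 0.1]  # first visual descriptor
visual_b = [0.5, 1.5, 0.2]  # second visual descriptor
audio = [1.2, 0.3, 0.4]     # audio descriptor

# Assumed weights; in practice they would be tuned on validation data.
fused, predicted = late_fusion([visual_a, visual_b, audio], [0.4, 0.4, 0.2])
print(predicted)
```

In this made-up example the second visual descriptor alone would favor subject 1, while the fused scores favor subject 0, which shows how combining modalities at the score level can change the final decision.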

Author information

Corresponding author

Correspondence to Manuel J. Marín-Jiménez.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Castro, F.M., Marín-Jiménez, M.J., Guil, N. (2015). Empirical Study of Audio-Visual Features Fusion for Gait Recognition. In: Azzopardi, G., Petkov, N. (eds) Computer Analysis of Images and Patterns. CAIP 2015. Lecture Notes in Computer Science, vol 9256. Springer, Cham. https://doi.org/10.1007/978-3-319-23192-1_61

  • DOI: https://doi.org/10.1007/978-3-319-23192-1_61

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23191-4

  • Online ISBN: 978-3-319-23192-1

  • eBook Packages: Computer Science, Computer Science (R0)
