A temporal belief-based hidden Markov model for human action recognition in medical videos

  • Applied Problems
Pattern Recognition and Image Analysis

Abstract

This paper presents a Temporal Belief-based Hidden Markov Model (HMM) for human action recognition from video sequences in a medical environment. The model copes with the temporal nature of human actions and handles both data uncertainty and incomplete knowledge. The activity recognition system is based on an HMM with explicit state duration. The global interpretation process uses the framework of the Transferable Belief Model (TBM), which makes it possible to model and manage uncertainty throughout the video interpretation process. The approach is applied to human action analysis in medical video sequences provided by a patient monitoring system in a hospital cardiology section. The proposed recognition method has been assessed on a database of 3000 video images of medical scenes, and its performance has been compared with that of probabilistic Hidden Markov Models.
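
To make the role of the TBM more concrete, the following is a minimal sketch, not the authors' implementation, of two standard TBM operations: the unnormalized conjunctive combination of two mass functions and the pignistic transform used to reach a decision. The action labels, the two sources, and all mass values are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical frame of discernment: three candidate actions.
FRAME = frozenset({"walk", "sit", "lie"})


def conjunctive_combine(m1, m2):
    """Unnormalized conjunctive rule of the TBM.

    Each mass function maps a frozenset of hypotheses to a mass.
    Mass assigned to the empty set measures the conflict between
    the two sources and is kept rather than renormalized away.
    """
    combined = {}
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
    return combined


def pignistic(m):
    """Pignistic transform: turn a mass function into a probability.

    The mass of each non-empty focal set is shared equally among its
    elements, after rescaling by 1 - m(empty set).
    """
    conflict = m.get(frozenset(), 0.0)
    if conflict >= 1.0:
        raise ValueError("total conflict: no decision is possible")
    betp = {}
    for focal, mass in m.items():
        if not focal:
            continue  # skip the empty set
        share = mass / (len(focal) * (1.0 - conflict))
        for hypothesis in focal:
            betp[hypothesis] = betp.get(hypothesis, 0.0) + share
    return betp


# Hypothetical evidence from two sources (illustrative values only).
m_source1 = {frozenset({"walk"}): 0.6, FRAME: 0.4}
m_source2 = {
    frozenset({"sit"}): 0.3,
    frozenset({"walk", "sit"}): 0.5,
    FRAME: 0.2,
}

combined = conjunctive_combine(m_source1, m_source2)
betp = pignistic(combined)

for focal, mass in combined.items():
    print(sorted(focal), round(mass, 4))
print({action: round(p, 4) for action, p in betp.items()})
```

Mass left on the whole frame expresses ignorance, and mass on the empty set expresses conflict between sources; a purely probabilistic HMM cannot represent these two quantities separately.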

Author information

Corresponding author

Correspondence to A. S. R. M. Ahouandjinou.

Additional information

The article is published in the original.

This paper uses the materials of the report submitted at the 11th International Conference “Pattern Recognition and Image Analysis: New Information Technologies,” Samara, Russia, September 23–28, 2013.

Arnaud S. R. M. Ahouandjinou received his BSc in Electrical Engineering and Industrial Computing and his MSc in Network Engineering and Computer Science from the Ecole Superieure de Genie Informatique (ESGI), Paris, France. He received a research Master’s degree in Mathematics Applied to Engineering Sciences, specializing in Ingenierie Numerique, Signal, Image and Informatique Industrielle, in 2010, and since then he has been a PhD student in computer science, signal, and image processing at the University of Littoral Cote d’Opale, Calais, France. His current research concerns the automatic visual surveillance of wide-area scenes using computational vision in medical settings. His research interests focus on the design of multicamera systems for real-time human action recognition, with management of data uncertainty in the vision system using probabilistic graphical models and belief propagation (TBM framework). He has recently become interested in machine learning approaches for human activity recognition through multisensor systems, in which collaboration and cooperation between the sensors are used to improve the high-level video interpretation process.

Cina Motamed is an associate professor of Computer Science at the University of Littoral Cote d’Opale, Calais, France. He received his BSc in mathematics and his MSc in Electrical Engineering and Computer Science from the University of Caen, France, and his PhD degree in Computer Science from the University of Compiegne, France, in 1987, 1989, and 1992, respectively. His current research concerns the automatic visual surveillance of wide-area scenes using computational vision. His research interests focus on the design of multicamera systems for real-time multiobject tracking and human action recognition. He has recently focused on uncertainty management in the vision system using graphical models and belief propagation. He is also interested in unsupervised learning approaches for human activity recognition.

Eugène C. Ezin received his PhD degree with the highest level of distinction in 2001, after research on neural and fuzzy systems for speech applications carried out at the International Institute for Advanced Scientific Studies in Italy. Since July 2012, he has been an associate professor in computer science. He has supervised many master’s theses in the same field. He is a reviewer for the Mexican International Conference on Artificial Intelligence and for journals. His research interests include machine learning, neural networks and fuzzy systems, signal and image processing, high performance computing, cryptography, modeling and simulation, information systems, and network security. He is also interested in human activity recognition through multisensor systems.

About this article

Cite this article

Ahouandjinou, A.S.R.M., Motamed, C. & Ezin, E.C. A temporal belief-based hidden Markov model for human action recognition in medical videos. Pattern Recognit. Image Anal. 25, 389–401 (2015). https://doi.org/10.1134/S1054661815030025

  • DOI: https://doi.org/10.1134/S1054661815030025
