
Skeleton-based 3D human pose estimation with low-resolution infrared array sensor using attention based CNN-BiGRU

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Human sensing based on low-resolution infrared sensors is widely used in hand gesture recognition, activity recognition, intrusion detection, and similar applications. However, previous infrared-based human sensing systems acquire only limited information about the human body. In this paper, a human pose estimation system is proposed that acquires three-dimensional skeleton information from low-resolution infrared sensors. Estimating human pose, which carries richer information about the body, from such low-resolution data is a difficult task. The system uses an \(8 \times 8\)-pixel low-resolution infrared array sensor to collect the activity data and a Kinect v2 camera to capture the three-dimensional skeleton of the human body, which serves as the annotation for the infrared data. A convolutional neural network-bidirectional gated recurrent unit model with an attention mechanism (CNN-BiGRU-AM) is trained to extract features of the infrared data along the spatial and temporal dimensions. The attention mechanism (AM) improves the model's ability to capture important local information. The joint coordinates predicted by the model are used to draw the three-dimensional skeleton diagram. The k-means clustering algorithm is applied to eliminate outliers in the predictions that degrade the overall visualization. The accuracy and completeness of human pose estimation are measured by the Euclidean distance between the ground-truth joint coordinates obtained by the Kinect v2 camera and the coordinates predicted by the model. The proportion of predictions with a Euclidean distance below a threshold of 20 mm is 90.151%, which represents the accuracy of human pose estimation. The experimental results show that three-dimensional skeleton information can be acquired accurately with the low-resolution infrared array sensor, and that the subtle differences within each activity can be observed through the 3D human pose, which can improve activity recognition.
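The accuracy figure in the abstract is a thresholded-distance metric: the share of predictions whose Euclidean distance to the Kinect v2 reference is below 20 mm. The following is a minimal sketch of how such a metric could be computed, not the authors' implementation. The function names, the sample coordinates, the assumption that coordinates are in millimetres, and the choice to count each joint as one prediction are all illustrative assumptions that the abstract does not confirm.

```python
import math

def joint_errors_mm(predicted, reference):
    """Euclidean distance for each predicted/reference 3D point pair.

    Both arguments are sequences of (x, y, z) tuples, assumed to be in
    millimetres (an assumption; the abstract only states the threshold unit).
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    return [math.dist(p, r) for p, r in zip(predicted, reference)]

def accuracy_within_threshold(predicted, reference, threshold_mm=20.0):
    """Fraction of predictions whose error is strictly below threshold_mm."""
    errors = joint_errors_mm(predicted, reference)
    if not errors:
        raise ValueError("at least one prediction is required")
    hits = sum(1 for e in errors if e < threshold_mm)
    return hits / len(errors)

# Hypothetical sample data: four joints, coordinates in millimetres.
reference = [(0, 0, 0), (100, 0, 0), (0, 500, 1000), (250, 250, 250)]
predicted = [(3, 4, 0), (100, 0, 30), (0, 512, 1009), (250, 250, 250)]

# Errors are 5, 30, 15 and 0 mm, so 3 of the 4 joints fall within 20 mm.
print(accuracy_within_threshold(predicted, reference))
```

Whether the paper aggregates per joint, per frame, or per skeleton is not stated in the abstract; the sketch treats every joint as an independent prediction, and a different aggregation would change only the grouping before the threshold test.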

Data availability statement

The datasets generated during the current study are available from the corresponding author on reasonable request.

Funding

This work was supported by the Natural Science Foundation of Fujian Province, China (No. 2022J01566).

Author information

Contributions

Conceptualization: JC; Methodology: JC, DC; Formal analysis and investigation: DC; Writing - original draft preparation: DC; Writing - review and editing: JC, HJ; Funding acquisition: JC, HJ, XM; Resources: JC, HJ, XM; Supervision: JC, HJ.

Corresponding author

Correspondence to Hao Jiang.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chen, J., Chen, D., Jiang, H. et al. Skeleton-based 3D human pose estimation with low-resolution infrared array sensor using attention based CNN-BiGRU. Int. J. Mach. Learn. & Cyber. 15, 2049–2062 (2024). https://doi.org/10.1007/s13042-023-02015-0

  • DOI: https://doi.org/10.1007/s13042-023-02015-0
