Abstract
Human sensing based on low-resolution infrared sensors is widely used in hand gesture recognition, activity recognition, intrusion detection, and similar tasks. However, previous infrared-based human sensing systems acquire only limited information about the human body, and estimating a richer human pose from low-resolution infrared data is a difficult task. In this paper, a human pose estimation system is proposed that acquires three-dimensional skeleton information from a low-resolution infrared sensor. The system uses an \(8 \times 8\) pixel low-resolution infrared array sensor to collect the activity data and a Kinect v2 camera to capture the three-dimensional skeleton of the human body as annotations for the infrared data. A convolutional neural network-bidirectional gated recurrent unit model with an attention mechanism (CNN-BiGRU-AM) is trained to extract the characteristics of the infrared data in both the spatial and temporal dimensions. The attention mechanism (AM) improves the ability of the model to capture important local information. The bone joint point data predicted by the model are used to draw the three-dimensional skeleton diagram, and the k-means clustering algorithm is applied to eliminate outliers in the predictions that degrade the overall visualization. The accuracy and completeness of human pose estimation are measured by the Euclidean distance between the real coordinates of the bone joint points obtained by the Kinect v2 camera and the coordinates predicted by the model. The proportion of predictions with a Euclidean distance below a threshold of 20 mm is 90.151%, which represents the accuracy of human pose estimation. The experimental results show that three-dimensional skeleton information can be acquired accurately by the low-resolution infrared array sensor, and that the subtle differences within each activity can be observed through the 3D human pose, which improves activity recognition.
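The accuracy measure described in the abstract can be illustrated with a short sketch. It is not the authors' evaluation code. It assumes that predictions and Kinect v2 annotations are stored as arrays of shape (frames, joints, 3) in millimetres, and that each joint in each frame counts as one prediction; the abstract does not say whether predictions are counted per joint or per frame. The function name `joint_accuracy` and its parameters are hypothetical.

```python
import numpy as np


def joint_accuracy(pred, truth, threshold_mm=20.0):
    """Fraction of predicted joints closer than threshold_mm to the ground truth.

    pred, truth: arrays of shape (n_frames, n_joints, 3), assumed to be in
    millimetres. Counting per joint is an assumption, not the paper's stated rule.
    """
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    if pred.shape != truth.shape:
        raise ValueError("pred and truth must have the same shape")
    # Euclidean distance between each predicted joint and its annotation.
    dist = np.linalg.norm(pred - truth, axis=-1)
    # Proportion of joints whose error is strictly below the threshold.
    return float(np.mean(dist < threshold_mm))


# Toy example: one frame, four joints, ground truth at the origin.
truth = np.zeros((1, 4, 3))
pred = np.array([[[3.0, 4.0, 0.0],      # error 5 mm
                  [0.0, 0.0, 19.0],     # error 19 mm
                  [20.0, 0.0, 0.0],     # error 20 mm (not below threshold)
                  [30.0, 40.0, 0.0]]])  # error 50 mm
print(joint_accuracy(pred, truth))  # 0.5
```

Under these assumptions, the reported 90.151% would correspond to a return value of about 0.90151 over the full test set.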
Data availability statement
The datasets generated during the current study are available from the corresponding author on reasonable request.
Funding
This work was supported by the Natural Science Foundation of Fujian Province, China (No. 2022J01566).
Author information
Contributions
Conceptualization: JC; Methodology: JC, DC; Formal analysis and investigation: DC; Writing-original draft preparation: DC; Writing-review and editing: JC, HJ; Funding acquisition: JC, HJ, XM; Resources: JC, HJ, XM; Supervision: JC, HJ.
Ethics declarations
Conflict of interest
The authors have no competing interests to declare that are relevant to the content of this article.
About this article
Cite this article
Chen, J., Chen, D., Jiang, H. et al. Skeleton-based 3D human pose estimation with low-resolution infrared array sensor using attention based CNN-BiGRU. Int. J. Mach. Learn. & Cyber. 15, 2049–2062 (2024). https://doi.org/10.1007/s13042-023-02015-0
DOI: https://doi.org/10.1007/s13042-023-02015-0