Abstract
In human activity recognition studies, it is important to identify a minimal set of features that can improve the recognition rate. In this paper we introduce a feature selection method that exploits differences in the correlation structure of the features between the classes of the target variable. Using recordings from triaxial accelerometers and gyroscopes, we extracted several features and split the data into subsets according to the activity performed. For each subset, we calculated the pairwise correlation coefficients of the features and compared the feature correlations across subsets. After identifying the significantly different correlations, we ranked the variables participating in them by their frequency of appearance, and thus created a subset of features intended to optimize the performance of a classification algorithm. The method lets the researcher choose the number of features to include in the classification. Two publicly available datasets were used to evaluate the proposed methodology in binary and multiclass classification problems. The evaluation showed promising results when the method was compared with the full feature set and with a feature selection method that has been used extensively in activity recognition studies.
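The ranking procedure outlined in the abstract can be illustrated with a minimal sketch. It assumes Pearson correlations and Fisher's z-test for two independent samples. The abstract does not state which coefficient or test is used, so both are assumptions, as are the function name, the `alpha` threshold, and the absence of any multiple-testing correction.

```python
import math
from itertools import combinations

import numpy as np


def rank_features_by_correlation_differences(X, y, alpha=0.05):
    """Rank features by how often they take part in a feature-feature
    correlation that differs significantly between two classes.

    Hypothetical helper: Pearson r and Fisher's z-test are assumptions,
    not necessarily the choices made in the paper.
    Returns (feature indices from most to least frequent, raw counts).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_features = X.shape[1]
    counts = np.zeros(n_features, dtype=int)
    # Index pairs (i, j) with i < j: each feature pair is tested once.
    iu, ju = np.triu_indices(n_features, k=1)

    # Per-class correlations, stored as Fisher z-transformed values.
    z_by_class = {}
    for label in np.unique(y):
        X_c = X[y == label]
        r = np.corrcoef(X_c, rowvar=False)[iu, ju]
        r = np.clip(r, -0.999999, 0.999999)  # keep arctanh finite
        z_by_class[label] = (np.arctanh(r), X_c.shape[0])

    # Compare every pair of classes (one comparison in the binary case).
    for a, b in combinations(z_by_class, 2):
        z_a, n_a = z_by_class[a]
        z_b, n_b = z_by_class[b]
        se = math.sqrt(1.0 / (n_a - 3) + 1.0 / (n_b - 3))
        stat = (z_a - z_b) / se
        # Two-sided p-value under the standard normal distribution.
        p = np.array([math.erfc(abs(s) / math.sqrt(2.0)) for s in stat])
        significant = p < alpha
        # Each significant pair adds one appearance to both features.
        np.add.at(counts, iu[significant], 1)
        np.add.at(counts, ju[significant], 1)

    order = np.argsort(-counts, kind="stable")
    return order, counts
```

Taking the first `k` entries of the returned ordering gives a feature subset of the size the researcher chooses. Each class needs more than three samples for the standard error to be defined.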
Acknowledgements
This research has been cofinanced by the European Regional Development Fund of the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH - CREATE - INNOVATE (project code: T1EDK-00686).
Cite this article
Tsanousa, A., Meditskos, G., Vrochidis, S. et al. A novel feature selection method based on comparison of correlations for human activity recognition problems. J Ambient Intell Human Comput 11, 5961–5975 (2020). https://doi.org/10.1007/s12652-020-01836-z
DOI: https://doi.org/10.1007/s12652-020-01836-z