Skip to main content
Log in

Random forest-based approach for physiological functional variable selection for driver’s stress level classification

  • Original Paper
  • Published:
Statistical Methods & Applications Aims and scope Submit manuscript

Abstract

This paper deals with physiological functional variables selection for driver’s stress level classification using random forests. Our analysis is performed on experimental data extracted from the drivedb open database available on PhysioNet website. The physiological measurements of interest are: electrodermal activity captured on the driver’s left hand and foot, electromyogram, respiration, and heart rate, collected from ten driving experiments carried out in three types of routes (rest area, city, and highway). The contributions of this work touch on the method as well as the application aspects. From a methodological viewpoint, the physiological signals are considered as functional variables, decomposed on a wavelet basis and then analyzed in search of most relevant variables. On the application side, the proposed approach provides a “blind” procedure for driver’s stress level classification, giving close performances to those resulting from the expert-based approach, when applied to the drivedb database. It also suggests new physiological features based on the wavelet levels corresponding to the functional variables wavelet decomposition. Finally, the proposed approach provides a ranking of physiological variables according to their importance in stress level classification. For the case under study, results suggest that the electromyogram and the heart rate signals are less relevant compared to the electrodermal and the respiration signals. Furthermore, the electrodermal activity measured on the driver’s foot was found more relevant than the one captured on the hand. Finally, the proposed approach also provided an order of relevance of the wavelet features.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9

Similar content being viewed by others

References

  • Akbas A (2011) Evaluation of the physiological data indicating the dynamic stress level of drivers. Sci Res Essays 6(2):430–439

    Google Scholar 

  • Alkali AH, Saatchi R, Elphick H, Burke D (2014) Short-time Fourier and wavelet transform analysis of respiration signal obtained by thermal imaging. In: 2014 9th International Symposium on Communication Systems, Networks & Digital Sign (CSNDSP). IEEE, pp 183–187. https://doi.org/10.1109/CSNDSP.2014.6923821

  • Auret L, Aldrich C (2011) Empirical comparison of tree ensemble variable importance measures. Chemometr Intell Lab Syst 105(2):157–170. https://doi.org/10.1016/j.chemolab.2010.12.004

    Article  Google Scholar 

  • Ayata D, Yaslan Y, Kamasak M (2016) Emotion recognition via random forest and galvanic skin response: comparison of time based feature sets, window sizes and wavelet approaches. In: 2016 Medical Technologies National Congress (TIPTEKNO). IEEE, pp 1–4. https://doi.org/10.1109/TIPTEKNO.2016.7863130

  • Bach FR (2008) Consistency of the group lasso and multiple kernel learning. J Mach Learn Res 9(Jun):1179–1225

    MathSciNet  MATH  Google Scholar 

  • Bostrom J (2005) Emotion-sensing PCs could feel your stress. PC World

  • Boucsein W (2012) Electrodermal activity. Springer, Berlin

    Book  Google Scholar 

  • Breiman L (2001) Random forests. Mach Learn 45(1):5–32. https://doi.org/10.1023/A:1010933404324

    Article  MATH  Google Scholar 

  • Breiman L, Friedman J, Stone CJ, Olshen RA (1984) Classification and regression trees. The wadsworth and Brooks–Cole statistics-probability series. Taylor & Francis, London

    Google Scholar 

  • Breiman L, Cutler A (2015) Randomforest: Breiman and cutler’s random forests for classification and regression. R Package Version 46-12 http://cran.r-project.org/package=randomForest

  • Chaudhary R (2013) Electrocardiogram comparison of stress recognition in automobile drivers on matlab. Adv Electron Electr Eng 3(8):1007–1012

    Google Scholar 

  • Deng Y, Wu Z, Chu C, Yang T (2012) Evaluating feature selection for stress identification. In: Information Reuse and Integration (IRI), 2012 IEEE 13th international conference on, pp 584–591. https://doi.org/10.1109/IRI.2012.6303062

  • Díaz-Uriarte R, de Andrés SA (2006) Gene selection and classification of microarray data using random forest. BMC Bioinform 7(1):1–13. https://doi.org/10.1186/1471-2105-7-3

    Article  Google Scholar 

  • El Haouij N, Poggi JM, Sevestre-Ghalila S, Ghozi R, Jaïdane M (2018) AffectiveROAD system and database to assess driver’s attention. In: SAC 2018: symposium on applied computing, April 9–13, Pau. https://doi.org/10.1145/3167132.3167395

  • Ferraty F, Vieu P (2006) Nonparametric functional data analysis: theory and practice (springer series in statistics). Springer-Verlag New York Inc., Secaucus

    MATH  Google Scholar 

  • Genuer R, Poggi JM, Tuleau-Malot C (2010) Variable selection using random forests. Pattern Recognit Lett 31(14):2225–2236. https://doi.org/10.1016/j.patrec.2010.03.014

    Article  Google Scholar 

  • Genuer R, Poggi JM, Tuleau-Malot C (2015) VSURF: an R package for variable selection using random forests. R J 7(2):19–33

    Article  Google Scholar 

  • Goldberger A, Amaral L, Glass L, Hausdorff J, Ivanov P, Mark R, Mietus J, Moody G, Peng CK, Stanley H (2000) Physiobank, physiotoolkit, and physionet: components of a new research resource for complex physiologic signals. Circulation 101(23):e215–e220

    Article  Google Scholar 

  • Granero AC, Fuentes-Hurtado F, Naranjo Ornedo V, Guixeres Provinciale J, Ausín JM, Alcañiz Raya M (2016) a Comparison of physiological signal analysis techniques and classifiers for automatic emotional evaluation of audiovisual contents. Front Comput Neurosci 10:74. https://doi.org/10.3389/fncom.2016.00074

    Google Scholar 

  • Gregorutti B (2016) RFgroove: importance measure and selection for groups of variables with random forests. R Package Version 11 http://cran.r-project.org/package=RFgroove

  • Gregorutti B, Michel B, Saint-Pierre P (2015) Grouped variable importance with random forests and application to multiple functional data analysis. Comput Stat Data Anal 90:15–35. https://doi.org/10.1016/j.csda.2015.04.002

    Article  MathSciNet  MATH  Google Scholar 

  • Gregorutti B, Michel B, Saint-Pierre P (2016) Correlation and variable importance in random forests. Stat Comput. https://doi.org/10.1007/s11222-016-9646-1

    MATH  Google Scholar 

  • Guendil Z, Lachiri Z, Maaoui C, Pruski A (2015) Emotion recognition from physiological signals using fusion of wavelet based features. In: 2015 7th International Conference on Modelling, Identification and Control (ICMIC), IEEE, pp 1–6. https://doi.org/10.1109/ICMIC.2015.7409485

  • Guyon I, Weston J, Barnhill S, Vapnik V (2002) Gene selection for cancer classification using support vector machines. Mach Learn 46(1–3):389–422. https://doi.org/10.1023/A:1012487302797

    Article  MATH  Google Scholar 

  • Hastie T, Tibshirani R, Friedman J (2001) The elements of statistical learning. Springer series in statistics. Springer New York Inc., New York

    MATH  Google Scholar 

  • Healey JA (2000) Wearable and automotive systems for affect recognition from physiology. Ph.D. Thesis, MIT Department of Electrical Engineering and Computer Science

  • Healey JA, Picard RW (2005) Detecting stress during real-world driving tasks using physiological sensors. IEEE Trans Intell Transp Syst 6(2):156–166. https://doi.org/10.1109/TITS.2005.848368

    Article  Google Scholar 

  • Horberry T, Anderson J, Regan MA, Triggs TJ, Brown J (2006) Driver distraction: the effects of concurrent in-vehicle tasks, road environment complexity and age on driving performance. Accid Anal Prev 38(1):185–191

    Article  Google Scholar 

  • Imam MH, Karmakar CK, Khandoker AH, Palaniswami M (2014) Effect of ECG-derived respiration (EDR) on modeling ventricular repolarization dynamics in different physiological and psychological conditions. Med Biol Eng Comput 52(10):851–860

    Article  Google Scholar 

  • Jolliffe I (2012) Principal Component Analysis. Springer, Berlin

    MATH  Google Scholar 

  • Karmakar C, Imam MH, Khandoker A, Palaniswami M (2014) Influence of psychological stress on QT interval. Computing in cardiology 2014:1009–1012

    Google Scholar 

  • Lin HP, Lin HY, Lin WL, Huang ACW (2011) Effects of stress, depression, and their interaction on heart rate, skin conductance, finger temperature, and respiratory rate: sympathetic-parasympathetic hypothesis of stress and depression. J Clin Psychol 67(10):1080–1091. https://doi.org/10.1002/jclp.20833

    Article  Google Scholar 

  • Louppe G, Wehenkel L, Sutera A, Geurts P (2013) Understanding variable importances in forests of randomized trees

  • Lykken DT (1972) Range correction applied to heart rate and to GSR data. Psychophysiology 9(3):373–379. https://doi.org/10.1111/j.1469-8986.1972.tb03222.x

    Article  Google Scholar 

  • Mallat SG (1989) A theory for multiresolution signal decomposition: the wavelet representation. IEEE Trans Pattern Anal Mach Intell 11(7):674–693

    Article  MATH  Google Scholar 

  • Nicodemus KK, Malley JD, Strobl C, Ziegler A (2010) The behaviour of random forest permutation-based variable importance measures under predictor correlation. BMC Bioinform 11(1):1–13. https://doi.org/10.1186/1471-2105-11-110

    Article  Google Scholar 

  • Picard RW, Fedor S, Ayzenberg Y (2016) Multiple arousal theory and daily-life electrodermal activity asymmetry. Emot Rev 8(1):62–75. https://doi.org/10.1177/1754073914565517

    Article  Google Scholar 

  • Poggi JM, Tuleau C (2007) Classification of objectivization data using cart and wavelets. In: Proceedings of the IASC 07. Aveiro, pp 1–8

  • R Core Team (2016) R: A language and environment for statistical computing. In: R foundation for statistical computing. Vienna. www.r-project.org

  • Ramsay JO, Silverman BW (2002) Applied functional data analysis: methods and case studies, vol 77. Springer, New York

    Book  MATH  Google Scholar 

  • Ramsay JO, Silverman BW (2005) Functional data analysis. Springer, New York. https://doi.org/10.1007/b98888

    MATH  Google Scholar 

  • Rigas G, Katsis C, Bougia P, Fotiadis D (2008) A reasoning-based framework for car drivers stress prediction. In: Control and automation, 2008 16th mediterranean conference on. pp 627–632. https://doi.org/10.1109/MED.2008.4602162

  • Sharma N, Gedeon T (2012) Objective measures, sensors and computational techniques for stress recognition and classification: a survey. Comput Methods Programs Biomed 108(3):1287–301. https://doi.org/10.1016/j.cmpb.2012.07.003

    Article  Google Scholar 

  • Sidek KA, Khalil I (2011) Automobile driver recognition under different physiological conditions using the electrocardiogram. PC World 38:753–756

    Google Scholar 

  • Singh RR, Conjeti S, Banerjee R (2012) Biosignal based on-road stress monitoring for automotive drivers. In: 2012 National Conference on Communications (NCC), IEEE, pp 1–5. https://doi.org/10.1109/NCC.2012.6176845

  • Singh M, Queyam AB (2013) Stress detection in automobile drivers using physiological parameters: a review. Int J Electron Eng 5(2):1–5

    Google Scholar 

  • Smart RG, Cannon E, Howard A, Frise P, Mann RE (2005) Can we design cars to prevent road rage? Int J Veh Inf Commun Syst 1(1–2):44–55. https://doi.org/10.1504/IJVICS.2005.007585

    Google Scholar 

  • Strobl C, Zeileis A (2008) Danger: high power!? exploring the statistical properties of a test for random forest variable importance. In: Proceedings of 18th international conference on computational statistics

  • Tao J, Tan T (2005) Affective computing: a review. In: International conference on affective computing and intelligent interaction. Springer, pp 981–995

  • Ullah S, Finch CF (2013) Applications of functional data analysis: a systematic review. BMC Med Res Methodol 13(1):43

    Article  Google Scholar 

  • Van Dooren M, De Vries JJ, Janssen JH (2012) Emotional sweating across the body: Comparing 16 different skin conductance measurement locations. Physiol Behav 106(2):298–304. https://doi.org/10.1016/j.physbeh.2012.01.020

    Article  Google Scholar 

  • Verikas A, Gelzinis A, Bacauskiene M (2011) Mining data with random forests: a survey and results of new tests. Pattern Recognit 44(2):330–349. https://doi.org/10.1016/j.patcog.2010.08.011

    Article  Google Scholar 

  • Yang K, Yoon H, Shahabi C (2005) A supervised feature subset selection technique for multivariate time series. In: Proceedings of the workshop on feature selection for data mining: interfacing machine learning with statistics, pp 92–101

  • Zhang L, Tamminedi T, Ganguli A, Yosiphon G, Yadegar J (2010) Hierarchical multiple sensor fusion using structurally learned Bayesian network. In: Wireless health 2010 on—WH ’10. ACM Press, New York, p 174. https://doi.org/10.1145/1921081.1921102

  • Zhu R, Zeng D, Kosorok MR (2012) Reinforcement learning trees. Technical reports on University of North Carolina

Download references

Acknowledgements

The authors gratefully acknowledge Dr. Chiraz Ben Abdelkader and Dr. Hassine Saidane for proofreading the paper. They also thank the anonymous referees for their useful suggestions and meaningful comments which led to a considerable improvement of this paper.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Neska El Haouij.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

El Haouij, N., Poggi, JM., Ghozi, R. et al. Random forest-based approach for physiological functional variable selection for driver’s stress level classification. Stat Methods Appl 28, 157–185 (2019). https://doi.org/10.1007/s10260-018-0423-5

Download citation

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10260-018-0423-5

Keywords

Mathematics Subject Classification

Navigation