Abstract
In this study, we aim to improve the user experience of the visually impaired when navigating in unfamiliar outdoor environments assisted by mobility technologies. We propose a framework for assessing their cognitive-emotional experience based on ambulatory monitoring and multimodal fusion of electroencephalography, electrodermal activity, and blood volume pulse signals. The proposed model is based on a random forest classifier which automatically infers the correct urban environment among eight predefined categories (AUROC 93 %). Geolocating the most predictive multimodal features that relate to cognitive load and stress, we provide further insights into the relationship of specific biomarkers with the environmental/situational factors that evoked them.
1 Introduction
Mobility in urban areas can be a challenging and emotionally stressful task for visually impaired people (VIP), especially when navigating in unfamiliar environments. Despite an increasing number of assistive technologies that help individuals with sight loss to augment their spatial awareness and wayfinding abilities when on the move, very few systems provide a high degree of independence beyond known environments that would allow VIP to achieve substantial mobility and integrate into everyday active life [14, 17]. Placing the visually impaired in the center of attention and exploiting recent developments in physiological computing and wearable wireless sensor devices, an extensive study was designed to better understand how people with sight loss perceive and interact with the urban space as manifested in their management of cognitive load and stress.
Orientation and mobility (O&M) in humans heavily relies on sight, which provides instantaneous, effortless access to anticipatory (e.g., stairs, turns, signs) and proactive (e.g., moving people, poles) information at various distances simultaneously [20]. Visually impaired pedestrians learn to obtain critical environmental information primarily through touch (sensing the ground surface with a white cane) and hearing (identifying and localising events and landmarks through sound). Mobility challenges can be summarized in four main problems: avoiding objects or obstacles (e.g., pedestrians, tree branches, improperly parked cars); detecting ground level changes (e.g., stairs, pavement edge or incline); negotiating street crossings (e.g., lack of curbs, traffic lights or sound signalling); and adapting to light variation (e.g., abrupt changes between different environments) [13, 24]. Although these problems generally diminish with increased experience of an environment, they still make travelling in unfamiliar settings particularly challenging, often preventing VIP from going outdoors altogether.
Despite a significant amount of research on understanding the perceptual and neurocognitive mechanisms by which people with sight loss access and process wayfinding information [8], there is still little practical knowledge of how the management of mental load and stress relates to the wayfinding process itself. This is a critical aspect of designing mobility technologies that has only recently been considered essential in developing an understanding of how environmental factors affect the cognitive-emotional states of the VIP [27]. Two studies in the early 1970s suggested that some form of psychological rather than physical stress is responsible for increased heart rate in visually impaired versus sighted pedestrians [21, 28]. More recently, examination of electrodermal activity [18] and electroencephalography [19] signals recorded from VIP during outdoor travel has shown that they experience psychological stress when walking on busy shopping streets, passing through large open areas, and crossing junctions.
Electrodermal activity (EDA) and heart rate (HR) are well-known indicators of physiological arousal and stress activation in affective computing and human-computer interfaces [5, 25]. EDA is more sensitive to emotion-related variations in arousal as opposed to physical stressors, which can be better reflected in the HR signal. Measurements of blood volume pulse (BVP), originally used to monitor HR, can also reflect transient processes in arousal and cognition [22]. Electroencephalography (EEG), on the other hand, can provide neurophysiological markers of cognitive-emotional processes induced by stress and indicated by changes in rhythmic patterns of brain activity [15, 16].
Taking advantage of the inherent and complementary properties of the EEG, EDA and BVP signals, this paper presents a multimodal approach to automatic inference of environmental conditions affecting VIP when navigating outdoors using a random forest classifier and features extracted from the three signals. The goal of the study was to discover biomarkers that can be used to detect shifts in emotional stress and cognitive load between different urban environments and situations. Aligning this information with GPS coordinates, we further studied the relationship of specific biomarkers with the environmental/situational factors that evoked them.
2 Design and Materials
A route was charted in the city centre of Reykjavik in Iceland (see Fig. 1) with the assistance of caretakers and O&M instructors to take the VIP through situations where different levels of stress were likely to occur. Accordingly, the route comprised eight distinct urban environments representative of a variety of mobility challenges, which can be grouped into three higher-level categories (see Table 1). The route was approximately 1 km long and took on average 13 min 44 s to walk (range = 9–19 min).
Eight VIP with different degrees of sight loss participated in the study (5 female; average age = 39 yrs, range = 22–51 yrs; relevant demographic characteristics are reported in Table 2). To help make them feel comfortable and safe, they were encouraged to walk as usual using their white canes and were accompanied by their familiar O&M instructor. Participants reported having no general health issues. They were instructed to avoid smoking normal or e-cigarettes and consuming caffeine or sugar (e.g., coffee, coke, chocolate) approximately 1 h prior to the walk. Recruitment was based on volunteering and all VIP were capable of giving free and informed consent. The study was approved by the National Bioethics Committee of Iceland. All data was anonymized before analysis.
EEG was recorded using the Emotiv EPOC+, a mobile headset with 16 passive electrodes registering over the 10–20 system locations AF3, F7, F3, FC5, T7, P3 (CMS), P7, O1, O2, P8, P4 (DRL), T8, FC6, F4, F8, and AF4 (sampling rate \(f_s = 128\) Hz). Given the practical constraints involved in an outdoor mobility study, EPOC+ was chosen because it provides a good compromise between performance (i.e., number of channels and scientific validity of the acquired EEG signals) and usability (i.e., outdoor portability, preparation time and user comfort) with respect to other commercial wireless EEG systems [1, 9–11].
Along with the Emotiv headset, participants were asked to wear the Empatica E4 wristband [12]. E4 measures the EDA signal through 2 ventral (inner) wrist electrodes (\(f_s = 4\) Hz) and the BVP through a dorsal (outer) wrist photoplethysmography (PPG) sensor (\(f_s = 64\) Hz). The wristband also includes an infrared thermopile sensor and a 3-axis accelerometer. E4 is currently the only commercial multi-sensor device developed based on extended scientific research in the areas of psychophysiology and physiological computing. Additionally, it has a cable-free, watch-like design, which makes it easier and more aesthetically pleasant to wear, and thus better fitted to use in outdoor measurements as compared to other wearable devices. Participants were asked to wear the wristband on the non-dominant hand to minimize motion artifacts related to handling the white cane [5].
Participants walked the charted route twice for training purposes. Directions were only provided during the first walk to help the VIP familiarize themselves with the route. They were instructed to avoid unnecessary head movements and hand gestures as well as talking to their O&M instructor unless there was an emergency. Video and audio were registered by means of a smartphone camera to facilitate data annotation (observing behaviours across the different urban environments) and synchronization (start/end of walk, urban environments and obstacles). In addition, GPS coordinates were logged via a Garmin GPSMAP-64s unit at a rate of 1 registration per second. At the end of the second walk, participants were asked to describe stressful moments along the route.
3 Data Analysis and Experiments
The goal of the data analysis was to explore features and markers from the collected brain and body signals which can be used to detect cognitive load and stress in humans during outdoor physical activity. While the relationship between unimodal physiological signals and psychological arousal has been studied extensively, the detection of stress from fused multimodal biosignal streams has received comparatively little attention. Specifically, the analysis focused on EEG (all 14 channels), EDA, and BVP data.
3.1 Signal Processing and Feature Extraction
The Emotiv EPOC+ system involves a number of internal signal conditioning steps. Analogue signals are first high-pass filtered with a 0.16 Hz cut-off, pre-amplified, low-pass filtered with an 83 Hz cut-off, and sampled at 2048 Hz. Digital signals are then notch-filtered at 50/60 Hz and down-sampled to 128 Hz prior to transmission. In this study, the EEG data obtained from the headset was time-domain interpolated using the Fast Fourier Transform (FFT) to account for missing samples due to connectivity issues. Interpolated signals were then normalized to decrease inter-individual variance. For each of the 14 channels, the power spectral intensity (PSI) [23] in each of the \(\delta\) (0.5–4 Hz), \(\theta\) (4–7 Hz), \(\alpha\) (7–12 Hz), and \(\beta\) (12–30 Hz) bands was computed using the PyEEG open source Python module [2]. The PSI of the kth band is defined as
\[ \text{PSI}_k = \sum_{i=\lfloor N f_k / f_s \rfloor}^{\lfloor N f_{k+1} / f_s \rfloor} |X_i|, \quad k = 1, 2, \ldots, K \]
where \(f_s\) is the sampling rate, N is the time series length, \(|X_1, X_2, \ldots , X_N|\) is the FFT of the series, \(f_k\) and \(f_{k+1}\) are the lower and upper edges of the kth band, and K is the total number of bands. In total, 56 EEG features were computed (14 channels × 4 bands).
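The band-wise PSI computation can be sketched in a few lines of NumPy. This is an illustrative re-implementation rather than the PyEEG code used in the study: the function name `power_spectral_intensity` and the `BAND_EDGES` constant are hypothetical, and the upper edge bin of each band is excluded here so that adjacent bands do not count the same FFT bin twice.

```python
import numpy as np

# Band edges in Hz as listed in the text: delta, theta, alpha, beta.
BAND_EDGES = [0.5, 4.0, 7.0, 12.0, 30.0]


def power_spectral_intensity(x, fs, band_edges=BAND_EDGES):
    """Sum FFT magnitudes inside each band; bin index = floor(N * f / fs)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    magnitudes = np.abs(np.fft.fft(x))
    psi = np.empty(len(band_edges) - 1)
    for k in range(len(band_edges) - 1):
        lo = int(np.floor(band_edges[k] / fs * n))
        hi = int(np.floor(band_edges[k + 1] / fs * n))
        # Half-open slice [lo, hi): an implementation choice of this sketch.
        psi[k] = magnitudes[lo:hi].sum()
    return psi
```

Applied to each of the 14 channels with `fs=128`, this would yield the 4 values per channel that make up the 56 EEG features.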
A measurement of skin conductance (SC) is characterized by two types of behaviour: short-lasting phasic responses (SCRs; can be thought of as rapidly changing peaks) and a long-term tonic level (SCL; can be thought of as the underlying slow-changing level in the absence of phasic activity). Another characteristic is the superposition of subsequent SCRs (i.e., one SCR emerges on top of the preceding one), typically observed in states of high arousal [5]. Skin conductance data obtained from the E4 was first low-pass filtered (1st order Butterworth, \(f_c = 0.6\) Hz) to remove steep peaks stemming from artifacts and subsequently min-max normalized to reduce inter-individual variance [7]. Conditioned SC signals were then decomposed into continuous components of phasic and tonic EDA using a deconvolution-based method implemented in Ledalab, a MATLAB-based toolbox [4]. Six features were extracted: number of SCRs (hereinafter SCRs), sum of their amplitudes (AS), average phasic EDA (PA), maximum phasic EDA (PM), time-integrated phasic EDA (ISCR), and mean tonic EDA (TonicMean).
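The conditioning step (low-pass filtering followed by min-max normalization) might look as follows with SciPy. This is a minimal sketch under stated assumptions: the function name `condition_skin_conductance` is hypothetical, and zero-phase filtering with `filtfilt` is our choice, since the text does not say how the filter was applied. The Ledalab decomposition into phasic and tonic components is not reproduced here.

```python
import numpy as np
from scipy.signal import butter, filtfilt


def condition_skin_conductance(sc, fs=4.0, cutoff_hz=0.6):
    """Low-pass filter (1st-order Butterworth) and min-max normalize an SC trace."""
    sc = np.asarray(sc, dtype=float)
    # fs=4 Hz is the E4 EDA sampling rate; the cut-off is given in Hz.
    b, a = butter(1, cutoff_hz, btype="low", fs=fs)
    # Assumption: forward-backward filtering to avoid phase distortion.
    smoothed = filtfilt(b, a, sc)
    span = smoothed.max() - smoothed.min()
    if span == 0:
        # A constant signal carries no information after normalization.
        return np.zeros_like(smoothed)
    return (smoothed - smoothed.min()) / span
```

The output lies in the interval [0, 1] for each participant, which is what makes the traces comparable across individuals.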
The BVP signal recorded by the E4 PPG sensor is preprocessed on board using a proprietary motion artifact removal technique [12]. No further conditioning was implemented and the reported data (i.e., BVP amplitude) was used directly as a feature of cardiovascular activity.
3.2 Classification Design
In order to automatically identify the affective meaning of an urban space based on biosignals recorded from VIP walking through it, we formulated the study as a supervised classification problem. A widely used ensemble learning method for classification was employed, namely the Random Forest (RF) classifier [6], selected for its ability to deal with possibly correlated predictor variables and because it provides a straightforward assessment of variable importance. For each of the distinct environments described in Table 1, each time point of the corresponding biosignal data was annotated based on a binary schema per second, where “1” signalled the presence of the participant in the given environment at the given time point and “0” otherwise.
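The per-second binary annotation can be illustrated as below. The segment boundaries, the letter codes A to H for the eight environments, and the function name `binary_labels` are assumptions made for the example; in the study the boundaries came from the video-based annotation.

```python
import numpy as np

# Assumed letter codes for the eight environments of Table 1.
ENVIRONMENTS = list("ABCDEFGH")


def binary_labels(segments, duration_s, environments=ENVIRONMENTS):
    """Build one 0/1 vector per environment, with one entry per second.

    segments: iterable of (start_s, end_s, environment), end exclusive.
    """
    labels = {env: np.zeros(duration_s, dtype=int) for env in environments}
    for start_s, end_s, env in segments:
        # Mark every second the participant spent in this environment.
        labels[env][start_s:end_s] = 1
    return labels
```

Each vector can serve as the target of a one-versus-all experiment, while the environment code itself is the target in the multi-class case.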
A series of experiments was designed to assess and compare the predictive power of each modality (EEG, EDA or BVP) as well as of their fusion on a feature-level basis, in both single-class and multi-class scenarios (see Table 3). The two most important parameters of RF were tuned by means of grid search with 5-fold cross-validation. We explored the effect of the number of estimators [150, 300, 600] as well as the effect of the maximum number of features \([0.5, 1, 2] \times \sqrt{\text{NumberOfFeatures}}\). Overall, the optimum number of estimators was 300 and the maximum number of features was set equal to the total number of features for each experiment.
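A grid search of this kind could be set up with scikit-learn as sketched below. The paper does not name the library or the exact settings, so the helper names `max_features_grid` and `tune_random_forest`, the stratified and shuffled folds, the rounding of the feature counts, and the `roc_auc_ovr_weighted` scorer are all assumptions of this example.

```python
import math

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold


def max_features_grid(n_features, multipliers=(0.5, 1.0, 2.0)):
    """Candidate feature counts: multipliers of sqrt(n_features), kept in [1, n_features]."""
    root = math.sqrt(n_features)
    candidates = {min(n_features, max(1, round(m * root))) for m in multipliers}
    return sorted(candidates)


def tune_random_forest(X, y, n_estimators_grid=(150, 300, 600), seed=0):
    """Grid search over tree count and max_features with 5-fold cross-validation."""
    param_grid = {
        "n_estimators": list(n_estimators_grid),
        "max_features": max_features_grid(X.shape[1]),
    }
    # Assumption: stratified, shuffled folds; the text only states 5 folds.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    search = GridSearchCV(
        RandomForestClassifier(random_state=seed),
        param_grid,
        scoring="roc_auc_ovr_weighted",
        cv=cv,
    )
    return search.fit(X, y)
```

With 56 EEG, 6 EDA and 1 BVP feature (63 in total), `max_features_grid(63)` gives the candidates 4, 8 and 16.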
For each experiment we estimated the relative rank (i.e. depth), as emerged from the “Gini” impurity function, of each feature in order to assess the relative importance of that feature to the predictability of the target variable [6]. We trained one model for each of the single-class cases and one for the multi-class experiment following a 5-fold cross-validation scheme, where 80 % of the data points were used for training and 20 % for testing, with data shuffling in order to avoid dependencies between consecutive data points. The best model was chosen as the one that maximised the weighted area under the receiver operating characteristic curve (AUROC), taking into account the lack of balance between the labels.
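As an illustration of the two quantities used here, the sketch below ranks features by impurity-based importance and computes a weighted one-versus-rest AUROC with scikit-learn. The helper names `rank_features` and `weighted_auroc` are hypothetical, and we assume that scikit-learn's `feature_importances_` (mean decrease in Gini impurity) is an acceptable stand-in for the ranking described in the text.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def rank_features(model, feature_names, top=10):
    """Return the `top` features sorted by impurity-based importance."""
    importances = model.feature_importances_
    order = np.argsort(importances)[::-1][:top]
    return [(feature_names[i], float(importances[i])) for i in order]


def weighted_auroc(model, X_test, y_test):
    """AUROC averaged over classes, weighted by class support."""
    proba = model.predict_proba(X_test)
    if proba.shape[1] == 2:
        # Binary (one-versus-all) experiment: score the positive class.
        return roc_auc_score(y_test, proba[:, 1])
    return roc_auc_score(
        y_test, proba, multi_class="ovr", average="weighted", labels=model.classes_
    )
```

Weighting by class support is one way to account for label imbalance, since the time spent in each environment differs along the route.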
3.3 Results
Table 4 summarises the weighted AUROC metric for all the experiments. Both modalities (Exps. I–III) are predictive of the distinct environments; however, the fusion of the two modalities gave particularly high results, not only in the one-versus-all scenario (Exp. IV) but also in the multi-class classification (Exp. V). Figure 2a depicts the weighted ROC curves of the latter in a one-against-all binary scenario, assessing the qualitative performance of each class. Interestingly, we note that the model performs equally well for all classes, which suggests that it is stable.
Figure 2b depicts the ten most predictive features of Exp. V. The feature importances were estimated also for all experiments and the most predictive ones appear always with the highest ranks. Interestingly, we note that the features related to skin conductance are the most predictive, with spectral power of the \(\beta \) brainwaves further dominating predictions. Although real-time EEG acquisition may be subject to very noisy signals, this finding is in line with the neuroscientific literature. A recent study on cognition and cortical activity after mental stress demonstrated that low amplitude beta waves with multiple and varying frequencies are often associated with active, busy, or anxious thinking and active concentration [3]. Another study confirmed that in subjects with high stress both baseline EEG (low frequency wave) and EEG during a stressful task (high frequency wave) were beta waves [16]. Theta waves were also observed during the stressful task and attributed to frustration and disappointment. This finding is in line with the fourth most important feature in the multi-class classification, which is a \(\theta \) wave.
3.4 Visualising Biomarker Density Distributions
To better understand the properties of the most predictive features that emerged from the classification experiments as well as the intensity of the cognitive-emotional responses they express, we assigned feature values to pairs of latitude and longitude coordinates based on recorded timestamps and assessed their geographical distributions by means of weighted kernel density estimation.
The recorded GPS traces were subject to noise due to our request for high sampling rate (1 Hz), therefore each trace was corrected by its Euclidean projection onto a reference route. The high sampling rate allowed us to immediately observe increased concentrations of GPS points when the VIP had to cross a main road (environment F, see Table 1), pass along parked cars in a narrow alley after the urban park (C), walk up and down stairs (E), or pass through a narrow area between construction works (H). In fact, these are the same situations reported as stressful by the participants themselves at the end of the study. Geographic information methods offer great promise in objectively measuring and studying the relationship of biomarkers to human behaviour in terms of physical and transport-related activity.
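The correction of a noisy trace by Euclidean projection onto a reference route can be sketched as follows. We assume the reference route is a polyline of vertices and that coordinates can be treated as planar over the roughly 1 km route; the function name `project_onto_route` is hypothetical.

```python
import numpy as np


def project_onto_route(points, route):
    """Snap each 2-D point to its nearest location on a polyline route."""
    points = np.asarray(points, dtype=float)
    route = np.asarray(route, dtype=float)
    starts, ends = route[:-1], route[1:]
    seg = ends - starts
    seg_len2 = np.einsum("ij,ij->i", seg, seg)
    # Guard against zero-length segments (repeated vertices).
    seg_len2 = np.where(seg_len2 == 0, 1.0, seg_len2)
    projected = np.empty_like(points)
    for n, p in enumerate(points):
        # Position of the orthogonal projection along each segment, clipped to it.
        t = np.clip(np.einsum("ij,ij->i", p - starts, seg) / seg_len2, 0.0, 1.0)
        candidates = starts + t[:, None] * seg
        dist2 = np.sum((candidates - p) ** 2, axis=1)
        projected[n] = candidates[np.argmin(dist2)]
    return projected
```

Because the sampling rate is constant, slower walking produces more projected points per unit of route length, which is what makes the increased concentrations at difficult locations visible.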
Let \( \{ \mathbf{x}_1, \mathbf{x}_2, \ldots , \mathbf{x}_n \} \) be an independent random sample drawn from some distribution with density function \(f(\mathbf{x})\) defined on \(\mathbb{R}^d\). The (multivariate) weighted kernel density estimate of f is defined in [26] as:
\[ \hat{f}(\mathbf{x}; H) = \frac{1}{\sum_{i=1}^{n} w(\mathbf{x}_i)} \sum_{i=1}^{n} w(\mathbf{x}_i)\, K_H(\mathbf{x} - \mathbf{x}_i) \]
where K is a kernel function, \(H > 0\) is a symmetric \( d \times d \) matrix which controls the bandwidth (or smoothing) of the estimate, \(K_H(\mathbf{x}) = |H|^{-1/2} K(H^{-1/2} \mathbf{x})\), and w is a function weighting each data point in the sample with a value from \(\mathbf{w} \in \mathbb{R}^m,\, m \le d \). A popular choice for K is the Gaussian (or normal) kernel, which was also applied here.
The three most predictive features were mean tonic EDA (TonicMean), number of SCRs (SCRs) and the sum of their amplitudes (AS). For each of them, we used the feature values as weights (w with \(m = 1\)) for the GPS coordinates (\(\mathbf{x}\) with \(d = 2\)) and a bandwidth of \(H(\mathbf{x}) = 0.0008\) to estimate the feature-weighted density of GPS points on a \(500 \times 500\) grid, and from this generated a contour plot for each participant. Figure 3 shows the resulting contours aggregated for all participants and plotted on top of an OSM map (the darker the colour, the higher the density of the distribution). Locations of increased stress-elicited arousal along the different urban settings of the route are clearly illustrated.
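A feature-weighted Gaussian kernel density estimate on a regular grid can be sketched with NumPy alone. Several choices here are assumptions of the example and not taken from the study: the bandwidth 0.0008 is interpreted as the standard deviation of an isotropic Gaussian kernel in coordinate units, the weights are assumed non-negative with a positive sum and are normalized by that sum, and the grid is padded by three bandwidths around the data. The function name `weighted_kde_grid` is hypothetical.

```python
import numpy as np


def weighted_kde_grid(lon, lat, weights, bandwidth=0.0008, grid_size=500):
    """Evaluate a weighted 2-D Gaussian KDE on a grid_size x grid_size grid."""
    lon, lat, weights = (np.asarray(a, dtype=float) for a in (lon, lat, weights))
    pad = 3 * bandwidth
    gx = np.linspace(lon.min() - pad, lon.max() + pad, grid_size)
    gy = np.linspace(lat.min() - pad, lat.max() + pad, grid_size)
    norm = 1.0 / (np.sqrt(2 * np.pi) * bandwidth)
    # An isotropic Gaussian kernel factorizes into one 1-D kernel per axis.
    kx = norm * np.exp(-0.5 * ((gx[:, None] - lon[None, :]) / bandwidth) ** 2)
    ky = norm * np.exp(-0.5 * ((gy[:, None] - lat[None, :]) / bandwidth) ** 2)
    # density[j, i] is the estimate at (gx[i], gy[j]).
    density = (ky * weights[None, :]) @ kx.T / weights.sum()
    return gx, gy, density
```

The returned arrays have the layout expected by common contour plotting routines, with rows indexed by latitude and columns by longitude.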
4 Conclusions
This study presents a framework for assessing the emotional experience of people with sight loss while navigating in unfamiliar outdoor environments, based on ambulatory monitoring and fusion of multimodal biosignal data. Different urban scenarios were compared, aiming to address the robustness of the model as well as emerging differences in the perception and interaction of the VIP with their surroundings. The high prediction rate (93 % weighted AUROC) is highly encouraging for this approach and, interestingly, the most predictive features of stress and cognitive load indicate as stressful “hotspots” (Fig. 3) scenes that coincide perfectly with the self-reported stressful situations experienced by the participants.
Among the limitations of the study is of course the recording precision of the mobile EEG headset as well as the limited number of participants which does not allow for an in depth analysis of specific stressors in each category of sight loss. Moreover, even if the city of Reykjavik does not present the complexity of big metropolitan areas, the charted route was designed in order to combine some of the busiest streets and most challenging settings reported by the VIP.
Future steps of this research include a refinement of the predictive model, extending the categories according to Table 1, as well as expanding to indoor navigation scenarios. Such findings hopefully pave the way to mobile technologies that take the concept of navigation one step further, accounting not only for the shortest path in an urban route but also for the least stressful and safest one.
References
1. Badcock, N.A., Mousikou, P., Mahajan, Y., de Lissa, P., Thie, J., McArthur, G.: Validation of the Emotiv EPOC EEG gaming system for measuring research quality auditory ERPs. PeerJ 1:e38 (2013)
2. Bao, F.S., Liu, X., Zhang, C.: PyEEG: an open source Python module for EEG/MEG feature extraction. Comput. Intell. Neurosci. 2011, e406391 (2011)
3. Baumeister, J., Barthel, T., Geiss, K., Weiss, M.: Influence of phosphatidylserine on cognitive performance and cortical activity after induced stress. Nutr. Neurosci. 11(3), 103–110 (2008)
4. Benedek, M., Kaernbach, C.: A continuous measure of phasic electrodermal activity. J. Neurosci. Methods 190, 80–91 (2010)
5. Boucsein, W.: Electrodermal Activity, 2nd edn. Springer, New York (2012)
6. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
7. Cacioppo, J.T., Tassinary, L.G.: Inferring psychological significance from physiological signals. Am. Psychol. 45(1), 16–28 (1990)
8. Cattaneo, Z., Vecchi, T., Cornoldi, C., Mammarella, I., Bonino, D., Ricciardi, E., Pietrini, P.: Imagery and spatial processes in blindness and visual impairment. Neurosci. Biobehav. Rev. 32, 1346–1360 (2008)
9. David, H.W., Whitaker, K.W., Ries, A.J., Vettel, J.M., Cortney, B.J., Kerick, S.E., McDowell, K.: Usability of four commercially-oriented EEG systems. J. Neural Eng. 11, 046018 (2014)
10. Debener, S., Minow, F., et al.: How about taking a low-cost, small, and wireless EEG for a walk? Psychophysiology 49, 1449–1453 (2012)
11. Ekandem, J.I., Davis, T.A., Alvarez, I., James, M.T., Gilbert, J.E.: Evaluating the ergonomics of BCI devices for research and experimentation. Ergonomics 55, 592–598 (2012)
12. Garbarino, M., Lai, M., Bender, D., Picard, R.W., Tognetti, S.: Empatica E3 - a wearable wireless multi-sensor device for real-time computerized biofeedback and data acquisition. In: EAI 4th International Conference Wireless Mobile Communication Healthcare (Mobihealth), pp. 39–42 (2014)
13. Geruschat, D.R., Smith, A.J.: Low vision for orientation and mobility. In: Wiener, W.R., Welsh, R.L., Blasch, B.B. (eds.) Foundations of Orientation and Mobility. History and Theory, vol. I, 3rd edn. AFB Press, New York (2010)
14. Giudice, N.A., Legge, G.E.: Blind navigation and the role of technology. In: Helal, A., Mokhtari, M., Abdulrazak, B. (eds.) The Engineering Handbook of Smart Technology for Aging, Disability, and Independence, pp. 479–500. John Wiley & Sons, Hoboken (2008)
15. Hosseini, S.A., Naghibi-Sistani, M.B.: Classification of emotional stress using brain activity. In: Gargiulo, G.D., McEwan, A. (eds.) Applied Biomedical Engineering, pp. 313–336. InTech, Rijeka (2011)
16. Jena, S.K.: Examination stress and its effect on EEG. Int. J. Med. Sci. Public Health 11(4), 1493–1497 (2015)
17. Marston, J.R., Golledge, R.G.: The hidden demand for participation in activities and travel by persons who are visually impaired. J. Vis. Impairment Blindness 97(8), 475–488 (2003)
18. Massot, B., Baltenneck, N., Gehin, C., Dittmar, A., McAdams, E.: EmoSense: an ambulatory device for the assessment of ANS activity–application in the objective evaluation of stress with the blind. IEEE Sensors J. 12(3), 543–551 (2012)
19. Mavros, P., Skroumpelou, K., Smith, A.H.: Understanding the urban experience of people with visual impairments. In: Proceedings of GIS Research UK 2015, pp. 401–406. Leeds, 15–17 April 2015
20. Millar, S.: Understanding and Representing Space: Theory and Evidence from Studies with Blind and Sighted Children. Clarendon, Oxford (1994)
21. Peake, P., Leonard, J.A.: The use of heart rate as an index of stress in blind pedestrians. Ergonomics 14(2), 189–204 (1971)
22. Peper, E., Harvey, R., Lin, I.M., Tylova, H., Moss, D.: Is there more to blood volume pulse than heart rate variability, respiratory sinus arrhythmia, and cardiorespiratory synchrony? Biofeedback 35(2), 54–61 (2007)
23. Quiroga, R.Q., Blanco, S., Rosso, O.A., Garcia, H., Rabinowicz, A.: Searching for hidden information with Gabor Transform in generalized tonic-clonic seizures. Electroencephalogr. Clin. Neurophysiol. 103, 434–439 (1997)
24. Quiñones, P.A., Greece, T.C., Yang, R., Newman, M.W.: Supporting visually impaired navigation: a needs-finding study. In: ACM CHI Conference Human Factors Computing Systems, pp. 1645–1650. Vancouver, BC, 7–12 May 2011
25. Turner, J.R.: Cardiovascular Reactivity and Stress: Patterns of Physiological Response. Springer, New York (1994)
26. Wand, M.P., Jones, M.C.: Kernel Smoothing. Chapman & Hall, London (1995)
27. Welsh, R.L.: Improving psychosocial functioning for orientation and mobility. In: Wiener, W.R., Welsh, R.L., Blasch, B.B. (eds.) Foundations of Orientation and Mobility. Instructional Strategies and Practical Applications, vol. 2, 3rd edn. AFB Press, New York (2010)
28. Wycherley, R.J., Nicklin, B.H.: The heart rate of blind and sighted pedestrians on a town route. Ergonomics 13(2), 181–192 (1970)
Acknowledgments
The research leading to these results has received funding from the European Union’s Horizon 2020 Research and Innovation programme under grant agreement No 643636 “Sound of Vision.” The authors wish to thank the administration and O&M instructors at the National Institute for the Blind, Visually Impaired, and Deafblind in Iceland for their valuable input and generous assistance.
© 2016 Springer International Publishing Switzerland
Saitis, C., Kalimeri, K. (2016). Identifying Urban Mobility Challenges for the Visually Impaired with Mobile Monitoring of Multimodal Biosignals. In: Antona, M., Stephanidis, C. (eds) Universal Access in Human-Computer Interaction. Users and Context Diversity. UAHCI 2016. Lecture Notes in Computer Science(), vol 9739. Springer, Cham. https://doi.org/10.1007/978-3-319-40238-3_59
DOI: https://doi.org/10.1007/978-3-319-40238-3_59
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-40237-6
Online ISBN: 978-3-319-40238-3