The effects of traveling in different transport modes on galvanic skin response (GSR) as a measure of stress: An observational study

Background: Stress is one of many ailments associated with urban


Introduction
Cities are the centre of multiple health and social challenges. Noncommunicable diseases (NCDs), for example, kill 41 million people every year, and are driven by factors associated with urbanization such as physical inactivity, poor diet, and air pollution (WHO 2019). Cities, however, also provide solutions to many such ailments through the creation of wealth, boosting of creativity, and opportunities for sustainable living (Bettencourt and West 2010). Population densities found in cities enable intense use of infrastructure, theoretically minimizing the need for private motorized transport and enabling alternatives such as walking and cycling. Many cities, however, are not planned to deliver optimally on such opportunities. Despite an increased recognition of the multiple benefits of active travel policies (de Nazelle et al. 2011; Mueller et al. 2015), few cities have developed ambitious programmes for its promotion. Active travel in particular is understood to simultaneously offer a viable non-polluting transport option and a convenient and economically affordable means of integrating physical activity into daily lives. It is space-efficient compared to driving, and thus can liberate public space for other beneficial uses such as greenspace. Questions remain, however, on whether cycling can help tackle stress, which is quintessentially associated with city life and is a recognized determinant of poor health (Seiler et al. 2020).
With fear of traffic ranking as one of its top deterrents (Heinen et al. 2010; Winters et al. 2017), experiences of stress can be expected from urban cycling. One study indeed found that specific conditions within cycling journeys cause 'peak stress', such as motorized vehicles moving too close to cyclists or turning in front of them (Caviedes and Figliozzi 2018), whereas Teixeira et al. (2020) found that physical segregation between cyclists and vehicles reduces the likelihood of stress. Another showed that travelling on local roads lowered levels of stress compared to collectors and arterials (Fitch et al. 2020). Recent studies have indicated, however, that active travel could in fact be associated with lower levels of stress in general (Avila-Palencia et al. 2018) and significantly higher happiness ratings (Zhu and Fan 2018; Fan et al. 2019). Recent research has also shown that cyclists tend to experience positive perceptions and affects during their journeys, for example as measured by: satisfaction with the work commute (Olsson et al. 2013; St-Louis et al. 2014), commute-time relaxation and excitement (Gatersleben and Uzzell 2007), journey-based affect and stress (LaJeunesse and Rodríguez 2012), self-perceived stress upon arrival at work (Gottholmseder et al. 2009), and higher positive mood or uplift of mood (Glasgow et al. 2019; Lancée et al. 2017). With the exception of the Caviedes and Figliozzi (2018) and Fitch et al. (2020) analyses, however, these previous studies relied on self-reported stress or related measures rather than an objective measure of stress. Moreover, most studies used cross-sectional designs, and none made any attempt to randomize the treatment (i.e., cycling). Thus, despite ample adjustments for multiple confounders in most cases, establishing the direction of influence (causation) has not been possible so far.
The galvanic skin response (GSR), a measure of continuous variations in the electrical conductance of skin, has been used in previous studies as an objective measure of stress (Helander 1978; Labbé et al. 2007; Hernandez et al. 2011). Skin conductance is understood to vary with the state of sweat glands in the skin, itself regulated by the autonomic nervous system (ANS): arousal of the sympathetic branch of the ANS causes sweat glands to produce more sweat, which in turn increases skin conductivity (Navea et al. 2019). Therefore, measurements of GSR have also been used to indicate psychological or physiological arousal (Zhai and Barreto 2006; Kelly and Jones 2010). GSR measurements can be made continuously, in a way that is non-intrusive and non-burdensome, using wearable sensors such as the BodyMedia SenseWear (Laeremans et al. 2017). GSR sensors are thus ideal for the study of stress responses to individuals' daily routines in their own real-world setting, such as during travel.
Although randomized controlled trial (RCT) approaches may be ideal to establish causation, an out-of-routine experiment would both be costly and remove the realities of exposures and responses during daily lives. As an appropriately designed natural experiment to address potential sources of bias would be nearly impossible in this context, the best alternative is to use a statistical framework that expressly balances the data before assessing treatment effects (Guo and Fraser 2014). Propensity score matching (PSM) is such an approach: it reduces bias by assembling a sample in which confounding factors are balanced between treatment groups (Morgan 2018), and it is meant to approximate RCTs and to help infer causation (Schneider and McDonald 2010).
We aimed to investigate the impacts of travel mode choices on objectively measured stress through the use of GSR measurements, using propensity score matching (PSM), which matches observations for subjects who share similar distributions of observed baseline covariates. We analyse one transport mode at a time (considered as the treatment group), matching observations of participants travelling by that transport mode with a corresponding control group consisting of observations of participants not in the same transport mode (i.e., the control group can include observations in non-transport activities or in other transport modes). As a result, all measured covariates are balanced across treatment and control groups, so that we are able to estimate treatment effects in a way that mimics those reported in an RCT with the benefit of reduced confounding: conditional on the propensity score, the differences in the outcomes between the treatment and control groups can essentially be attributed to the treatment effects (Li et al. 2019). Regression analysis then enables us to estimate the causal effects of using a single transport mode on stress measured by GSR. We mainly focus on whether individuals feel less stressed while they are cycling compared to while they are not cycling, and repeat the analysis for motorized travel and walking.

Overview
Propensity score matching (PSM) can be used as a quasi-experimental method (Jones and Lewis 2015) to identify the impact of a particular intervention or event (a "treatment") in a study that is not feasible to randomize. We establish total times (minute-by-minute observations) while cycling as our treatment group, and use PSM to artificially construct a control group randomly based on observed characteristics. The matching process involves the development of a logit model to explain treatment assignment, which is then used to identify controls with similar matching characteristics. Note that the control group contains observations taken during any activity other than cycling, so it may include (but is not restricted to) other travel activities and may include observations from individuals who cycle at other times of the day. We verify adequate matching by comparing the distribution of confounders across the two groups. We then compare our outcome measure, GSR, across the two groups using a linear mixed model, and compute the average treatment effect as an overall assessment of the impacts of cycling on stress. We repeat the analysis for walking, and then for the combination of private and public motorized travel.
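To make the matching step concrete, the following is a minimal sketch of 1:1 nearest-neighbour matching on already-estimated propensity scores. It is an illustration only, not the study's procedure (which is carried out with the MatchIt R package): the function name `match_nearest`, the greedy without-replacement strategy, the optional `caliper` argument and the toy scores are all assumptions introduced here.

```python
def match_nearest(treated, controls, caliper=None):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, controls: dicts mapping an observation id to its propensity score.
    caliper: optional maximum allowed score distance (hypothetical parameter).
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)  # controls not yet used in a pair
    pairs = []
    # Visit treated observations in sorted id order so results are reproducible.
    for t_id, t_score in sorted(treated.items()):
        if not available:
            break
        # Pick the unused control whose score is closest to the treated score.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if caliper is not None and abs(available[c_id] - t_score) > caliper:
            continue  # no acceptable control for this treated observation
        pairs.append((t_id, c_id))
        del available[c_id]
    return pairs


# Toy propensity scores (illustrative values, not study data).
treated = {"t1": 0.62, "t2": 0.35}
controls = {"c1": 0.10, "c2": 0.33, "c3": 0.60, "c4": 0.90}
pairs = match_nearest(treated, controls)
print(pairs)
```

Each treated observation is paired with the closest unused control, which is why a matched sample of this kind contains equal numbers of treated and control observations.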

Study population and design
We use data from a panel study on health and transport in three large European cities (Antwerp, Barcelona, London) as part of the Physical Activity through Sustainable Transport Approaches (PASTA) project (Gerike et al. 2016; Dons et al. 2015). In this component of the EU project, 122 adults (41 each in Antwerp and Barcelona, 40 in London) were recruited from the pool of respondents of a larger longitudinal survey who answered positively to taking part in a sensor-based sub-study. Only non-smoking adults between the ages of 18 and 65 and with a BMI lower than 30 were selected for the sub-study, and we aimed for a balanced sample of males and females with a range of physical activity patterns. For one full week, participants were asked to wear a GPS monitor and a 'SenseWear' device (among other sensors not related to the current paper) and to fill in a comprehensive baseline questionnaire, including a one-day travel activity diary. This measurement week was repeated three times, in Winter, Summer and Spring/Fall. In addition to GSR measurements, the SenseWear records data such as skin temperature and physical activity level (as one-minute average METs). The questionnaire data provide participants' socio-demographic characteristics and travel habits. Travel time and mode identification were derived from the combination of physical activity, GPS and diary data. More details can be found in Avila-Palencia et al. (2019) and Laeremans et al. (2018a, 2018b).

Outcome
GSR, measured minute by minute on our participants throughout their three weeks of participation in the study, is our response variable. GSR, which refers to changes in sweat gland activity that are associated with the intensity of emotional arousal (Kyriakou et al. 2019), is regarded as a proxy of overall stress level once confounders such as physical activity and temperature are accounted for (Sarker et al. 2016).

Treatment variable
We first establish the "treatment" variable used in the propensity score matching (PSM) approach. Methods will be explained with cycling as the treatment variable, but we will also show results of subsequent analyses with motorized travel (public and private) and walking as respective treatment groups. We used a bespoke mode detection algorithm to classify mode choice, as built-in activity detection based solely on accelerometry in the SenseWear device did not show high accuracy when compared to the travel activity diaries. The algorithm (summarized in Appendix Section 1) combined activity diary, GPS, and activity detection data from the SenseWear to identify four distinct travel modes (1 = walking; 2 = cycling; 3 = motoring, i.e., car, taxi, motorcycle, public transport; 4 = others) and a stationary state (5 = stationary). As we want to compare the overall stress of participants while cycling to that while not cycling, the treatment variable in our potential outcomes framework is a dichotomous variable: an observation from an individual participant who is cycling (=1) or an observation from an individual participant who is not cycling (=0).
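The dichotomization described above can be sketched as follows. The mode codes are those listed in the text; the function name `treatment_indicator` and the example sequence of minute-level codes are hypothetical and serve only to illustrate the recoding.

```python
# Mode codes as listed in the text (1-4 are travel modes, 5 is stationary).
MODE_LABELS = {1: "walking", 2: "cycling", 3: "motoring", 4: "others", 5: "stationary"}


def treatment_indicator(mode_code, treated_mode=2):
    """Return 1 if a minute-level observation is in the treated mode, else 0."""
    if mode_code not in MODE_LABELS:
        raise ValueError(f"unknown mode code: {mode_code}")
    return 1 if mode_code == treated_mode else 0


# Hypothetical sequence of detected modes for seven consecutive minutes.
minute_modes = [5, 5, 2, 2, 1, 3, 5]
cycling_flags = [treatment_indicator(m) for m in minute_modes]
walking_flags = [treatment_indicator(m, treated_mode=1) for m in minute_modes]
print(cycling_flags, walking_flags)
```

Changing `treated_mode` reuses the same observations for the walking and motoring analyses, so the control group for one mode can contain minutes spent in the other modes.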

Other explanatory measures
Several variables that may influence GSR were recorded in the SenseWear: i.e., near-body temperature (average near body ambient temperature per minute), METs (Metabolic Equivalents of Task, the ratio of metabolic rate (the rate of energy consumption) during a specific physical activity to a reference metabolic rate), and heatflux (a measure of the amount of energy being dissipated by the body to its surroundings as convective heat per unit area). The near-body temperature sensor is attached to the heat flux sensor and is the temperature on the outer side of the SenseWear. Metabolic Equivalents of Task (METs) are calculated through a proprietary algorithm using the data from the accelerometer on the SenseWear. Heatflux is derived through a thermally conductive sensor in the SenseWear between the skin at the point of contact with the SenseWear and the immediate surroundings of the device.
The SenseWear also recorded the date and time of the observations of each participant, which we categorized into three groups: morning peak (6am-9am), afternoon peak (4pm-7pm) and non-peak hours. All participants were asked to fill in the PASTA baseline questionnaire so that the date of birth, sex, educational level, income status, smoking status, weight and height of each individual were recorded. For each participant, age was computed from the date of birth and body mass index (BMI) from self-reported weight and height. The rest of the variables retrieved from the questionnaire were regarded as categorical variables. For example, sex is a variable with two categories (male and female), as is education (secondary education and higher/university education), whereas income is represented by seven categories (from '<€10,000' to '≥€150,000'). These variables are subject-level baseline covariates and they likely affect both treatment assignment (choice of a single transport mode) and the outcome (GSR). Therefore, it is appropriate to include them in PSM (Austin 2011). Furthermore, to estimate treatment effects with greater precision when doing PSM, we also need to take into account variables that do not affect treatment assignment but that affect the outcome (Austin 2007; Brookhart et al. 2006). In our case, together with the aforementioned baseline covariates, the variables that may influence GSR and are collected by the SenseWear (near-body temperature and METs) are also considered in the PSM stage. Heatflux is excluded from this stage because it is correlated with near-body temperature, as shown in Appendix Fig. S3-1.
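As an illustration of how these derived covariates could be computed, the sketch below assigns a timestamp to one of the three travel periods and computes BMI from weight and height. The function names, the half-open treatment of the period boundaries (e.g. 9:00 counts as non-peak) and the units (kilograms and metres) are assumptions made for this example; the text does not specify them.

```python
from datetime import datetime


def travel_period(timestamp):
    """Classify a timestamp as morning peak, afternoon peak or non-peak.

    Assumes half-open intervals: [06:00, 09:00) and [16:00, 19:00).
    """
    hour = timestamp.hour
    if 6 <= hour < 9:
        return "morning_peak"
    if 16 <= hour < 19:
        return "afternoon_peak"
    return "non_peak"


def body_mass_index(weight_kg, height_m):
    """BMI = weight (kg) divided by height (m) squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical observations (not study data).
print(travel_period(datetime(2015, 3, 2, 8, 59)))   # inside the morning peak
print(travel_period(datetime(2015, 3, 2, 12, 30)))  # outside both peaks
print(round(body_mass_index(70.0, 1.75), 1))
```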

Propensity score matching and linear mixed models
Details of PSM procedures are shown in Appendix Section 2. In short, the PSM method entails identifying a sample of control observations to match observations in the treatment group (here while cycling/walking/motoring as treatment vs while not cycling/not walking/not motoring as control), based on a set of independent covariates that explain both treatment (cycling/walking/motoring) and outcome (GSR). A logistic model is developed using the MatchIt R package (Ho et al. 2007; Ho et al. 2011) (see Appendix Section 1.2) in order to calculate the propensity scores used to match treatment and control samples, after excluding highly correlated covariates to avoid multicollinearity (see Appendix Section 2.1). Once a matched sample is constructed, the standardized mean difference (SMD) and other diagnostics are used to verify whether covariates are balanced between treatment and control groups, that is, whether the distributions of those covariates are the same between treated and untreated groups. Based on the matched sample, linear mixed models (LMMs) are then derived to predict the outcome in order to evaluate the average treatment effect on the treated population (ATT). The ATT is established by averaging the effect of a (single) transport mode on stress over those participants who are using this transport mode. The LMMs used in our study are outcome regression (OR), the inverse probability weighting model (IPWM), and augmented regression (AR), all with GSR as the response variable. The OR model does not make use of propensity scores, while the IPWM uses the inverse of the propensity scores as weights so that weighted least squares (WLS) can be employed to estimate the outcomes. These weights are also known as inverse probability weights (IP weights). AR also uses the IP weights but includes them as a covariate in the model. To account for the repeated measures design of our data collection, we use individuals as random effects.
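The balance check can be illustrated with a short sketch of the standardized mean difference for a single continuous covariate. The formula used here (difference in means divided by the square root of the average of the two group variances) is one common definition and is an assumption of this example; the exact diagnostics used in the study are described in its Appendix. The function name and toy values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """SMD for one continuous covariate, using the pooled sample variance."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    if pooled_sd == 0:
        raise ValueError("covariate has zero variance in both groups")
    return (mean(treated) - mean(control)) / pooled_sd


# Toy near-body temperature values (illustrative only, not study data).
treated_temp = [2.0, 3.0, 4.0]
control_temp = [1.0, 2.0, 3.0]
smd = standardized_mean_difference(treated_temp, control_temp)
print(smd)
```

A value close to zero indicates that the covariate has a similar mean in both groups; an absolute SMD below 0.1 is a frequently used rule of thumb for adequate balance, although that threshold is not stated in the text.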
We use the user ID, a factor containing a unique ID number for each individual, as the random effect variable grouping all personal factors otherwise included in baseline characteristics (e.g., age, gender, city). The models include activity-specific characteristics (treatment variable, near-body temperature, METs, travel period) to reduce possible residual imbalance or remaining confounding (Nielsen 2016;Stuart 2010). Individual characteristics obtained that are shared by all participants from the baseline questionnaire (gender, age etc.) are also included in the models as fixed covariates. The activity-specific characteristics are included in both fixed and random effects parts of LMMs since they have subject-specific effects that are unique to a particular participant. Specifying individuals as random effects with random slopes for each participant impacted by activity characteristics in the LMMs (as each person is assumed to have different baseline of activity characteristics) allows us to incorporate person-specific variability in the GSR because each participant has their own unique "curve" that describes longitudinal change in the response (Fitzmaurice and Ravichandran 2008).
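One possible way to write the model structure described above is given below. This is a schematic sketch under stated assumptions rather than the fitted specification: all symbols are introduced here for illustration, with $T_{ij}$ the treatment indicator for observation $j$ of participant $i$, $\mathbf{a}_{ij}$ the other activity-specific covariates (near-body temperature, METs, travel period), $\mathbf{x}_i$ the baseline covariates, and $\mathbf{z}_{ij} = (T_{ij}, \mathbf{a}_{ij})$ the variables given participant-specific slopes. The log-transformed outcome follows the transformation reported in the Results.

```latex
\log(\mathrm{GSR}_{ij})
  = \underbrace{\beta_0 + \beta_1 T_{ij}
      + \boldsymbol{\beta}_2^{\top}\mathbf{a}_{ij}
      + \boldsymbol{\beta}_3^{\top}\mathbf{x}_{i}}_{\text{fixed effects}}
  + \underbrace{b_{0i} + \mathbf{b}_{1i}^{\top}\mathbf{z}_{ij}}_{\text{random effects for participant } i}
  + \varepsilon_{ij},
\qquad
(b_{0i}, \mathbf{b}_{1i}) \sim \mathcal{N}(\mathbf{0}, \mathbf{G}),
\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here $b_{0i}$ is the random intercept that lets each participant have their own baseline level, and $\mathbf{b}_{1i}$ holds the random slopes that let the effect of the activity characteristics differ between participants.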

Description of the study sample
As detailed in a previous analysis (see Avila-Palencia et al. 2019), our sample of 122 participants was almost equally distributed over the three cities, was relatively young (average age 35 years), slightly more female (55%) than male, in large majority educated to a higher level (89%), and mostly fully employed (72%). More than half of participants (69.7%) had an annual household income of more than €25,000, based on the information available (participants who did not share their income were recorded as unknown).

Cycling analysis
A total of 3,258,656 minute-based observations were available from the pool of participants, averaging 18.5 days per participant. From these, the matching process for the cycling analysis yielded a final sample of 102,288 minute-by-minute observations, equally divided between the treatment and control groups, with 51,144 observations each. The distribution of propensity scores between treatment and control groups before and after PSM demonstrates a successful matching process, the details of which are shown in Appendix Section 3.2. As expected, our final sample was well balanced with regard to selected covariates across control and treatment groups, including near-body temperature, average METs and sex (Appendix Table S3-2). In addition, we measured an average (SD: standard deviation) GSR in our treatment group of 0.18 (0.27), versus 0.22 (0.37) in the control group. The control group consisted of 2113 observations while walking, 4051 while motoring, and 44,361 while being 'stationary'; the remaining observations (620) belong to an unknown travel mode (i.e., while travelling, but the mode could not be recognized).
Our final sample of observations after matching has similar characteristics as our initial pool of participants in terms of age, BMI, education, and gender. Where they differ most markedly is in the distribution across the three cities, with now 46% of the sample observations in Antwerp, and the rest evenly distributed between Barcelona and London (see detailed distributions in Appendix Section 8).
The distribution of the outcome variable is heavily skewed to the left (as shown in Fig. 1); we therefore use the log transformation of GSR as our response variable in the three LMMs developed to evaluate the ATT. Taking the log of GSR makes residuals closer to a normal distribution and reduces heteroscedasticity, improving the fit of the model. In all three fitted models, all covariates (except non-peak hours cycling period, cycling in Barcelona and some individual-level characteristics such as health, smoking status, income status and educational level) are shown to be significant at the p < 0.05 level in explaining the log of GSR (Table 1). Across all three models, cycling is shown to decrease log(GSR) compared to not cycling (at that moment), accounting for physical activity levels, time of day and near-body temperature. As expected, METs and near-body temperature are also strongly positively associated with log(GSR). Compared to the morning peak, cycling during the afternoon peak showed higher log(GSR), and there was no statistically significant difference with off-peak travel. It appears that cycling in Barcelona and London led to lower log(GSR) compared with cycling in Antwerp, but the difference in log(GSR) between cycling in Barcelona and Antwerp was not statistically significant. Table 1 also shows that females generally have lower log(GSR) than males while cycling.
Fig. 1. Distributions of GSR for three single transport modes (walking, cycling, and motoring) before and after log transformation.
Estimating the ATT for participants with observations matched by their nearest propensity scores (using Eq. (S.4) in Appendix Section 2.1) indicates that participants would feel less stressed while cycling compared to not cycling, as GSR is 10.90% lower while cycling when using the OR model. This effect is attenuated to −5.72% when applying the IPWM model, and is −11.07% when using the AR model (Fig. 2 and Table S6).
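Because the models are fitted on log(GSR), a treatment coefficient on the log scale can be read as a percentage change in GSR. The sketch below shows the standard conversion, 100 × (exp(β) − 1). It is offered as a general illustration and is not a reproduction of Eq. (S.4), which is not given in the text; the function names are hypothetical, and the coefficient is back-calculated from the reported −10.90% purely for demonstration.

```python
import math


def percent_change_from_log_coef(beta):
    """Percentage change in the outcome implied by a coefficient on log(outcome)."""
    return (math.exp(beta) - 1.0) * 100.0


def log_coef_from_percent_change(pct):
    """Inverse conversion: log-scale coefficient implied by a percentage change."""
    if pct <= -100.0:
        raise ValueError("percentage change must be greater than -100")
    return math.log(1.0 + pct / 100.0)


# Log-scale coefficient that would correspond to a 10.90% reduction in GSR.
beta = log_coef_from_percent_change(-10.90)
print(round(beta, 4), round(percent_change_from_log_coef(beta), 2))
```

For small effects the coefficient and the percentage change nearly coincide (a coefficient of 0.01 is roughly a 1% increase), while for larger effects the exponential conversion matters.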

Walking and motorized travel analysis
The walking and motorized travel analyses showed similarly successful matching diagnostics to the cycling analysis (Appendix Table S4-1 and S5-1). Walking was shown to result in a statistically significant reduction in stress level, with p < 0.05 in the OR and AR models, and p < 0.1 in the IPWM model (Appendix S4-2). Taking motorized (public or private) transport led to a significant increase in stress when applying the OR (p < 0.05) and AR (p < 0.1) models, but the motorized travel treatment variable did not reach statistical significance in the IPWM model (Appendix Table S5-2). Other covariates across the models had generally similar effects as for the cycling analysis, except for the IP weights losing their significance in the walking analysis (AR model) (Appendix Table S4-1 and S4-2). When estimating the ATT, we find overall that walking lowers stress, as GSR is reduced by 3.94% to 5.68% depending on the model applied (Fig. 2 and Appendix Table S6). Conversely, motoring (i.e., car, motorcycle, taxi, public transport) slightly increases stress levels, with effects ranging from 0.94% to 1.13%. None of the 95% confidence intervals include the value 0, indicating overall statistically significant impacts (Fig. 2 and Appendix Table S6).
Fig. 2. Treatment effects (with their 95% CIs) of three (single) transport modes vs any other activity using propensity score matching and three regression models (outcome regression, inverse probability weighted regression, and augmented regression).

Summary of results
We evaluated relationships between stress and travel mode, using GSR as an objective measure of stress monitored on a minute-by-minute basis for three separate weeks on a free-living population of 122 adults living in three large European cities. We used PSM to account for confounding effects by measured socio-demographic and activity level confounders. We use three regression model formulations to establish robust results, and consistently find across these models that cycling significantly lowers stress levels compared to engaging in other activities, accounting for physical activity and other confounders. For example, in the outcome regression model, cycling was shown to decrease stress levels as GSR is reduced by up to 10.51% [95% CI: 4.88-22.70%] compared to GSR while not cycling. Similarly, walking was shown to reduce stress as GSR is lowered by up to 6.24% [95% CI: 2.90-13.34%]. Traveling in motorized transport was shown, on the other hand, to be more stressful since GSR is increased by as much as 1.43% [95% CI: 0.62-3.16%].

Comparison with previous studies
Our results are largely in line with previous findings on stress-related benefits of cycling, or more generally active modes of travel, as compared to motorized modes. Many studies report general self-reported stress and wellbeing associated with mode choice rather than stress experienced during travel itself. Avila-Palencia et al. (2018), for example, found significant decreases in self-reported stress and improvement in self-reported mental health for each additional day of cycling per month, but no such associations with other modes. Similar results were found for cycling in relation to self-reported physical health and vitality (but none for the other modes). In a study of commuters that excluded cyclists, pedestrians were the least stressed, followed by transit users and then car users (Legrain et al. 2015). Lower levels of stress in pedestrians compared to car users were in part explained by greater enjoyment of the travel experience, and mediated by their satisfaction with comfort and safety (Legrain et al. 2015). In a London-based commuter study, life satisfaction but not mental distress was significantly associated with walking or cycling vs driving, with varying results for public transport depending on type and connectivity (Chng et al. 2016).
Our approach is most akin to studies that have focused on the experiential element of travel modes, as we assessed stress experienced while cycling (or while walking/motoring). A growing body of research has found higher levels of satisfaction with their commute among cyclists and pedestrians than among users of other modes (St-Louis et al. 2014; Gatersleben and Uzzell 2007; LaJeunesse and Rodríguez 2012; Olsson et al. 2013; Paige Willis et al. 2013; Smith 2017; Handy and Thigpen 2019; Singleton 2019). Mokhtarian et al. (2015) found that, in comparison with walking, bicycle and motorcycle trips were more often seen as pleasant, and motorized modes less often seen as pleasant. Gatersleben and Uzzell (2007) explained how emotions such as a sense of relaxation or excitement experienced during active travel may lead to more satisfactory commutes compared to other modes. Wild and Woodward (2019) found that cyclists are the happiest commuters because of the predictability of travel time, the pleasures of physical activity, and opportunities for casual social interactions and nature contact. Compared to car users, LaJeunesse and Rodríguez (2012) reported significantly lower journey-based self-assessed stress in pedestrians, cyclists, and bus users, and conversely higher attunement (sense of peace and relaxation) in active modes and for bus users. In an in-depth analysis of effects associated with commuter well-being, Singleton (2019) found that pedestrians and cyclists in Portland, Oregon clearly benefited from higher levels of physical and mental health, confidence, positive affect, and overall hedonic well-being than users of other modes. The study, however, also showed that cyclists suffered more from fear than users of other modes of travel, indicating how a deeper investigation of psychological constructs may also highlight some detrimental aspects of active travel.
In a Montreal-based study, built and natural environment features could not explain the higher levels of commute satisfaction among cyclists except for surprising positive impacts of slopes, but season mattered (Paige Willis et al. 2013). Moreover, some studies have found that contrary to other mode users, cyclists' satisfaction with their commute was unaffected by distance or congestion (Paige Willis et al. 2013;Smith 2017). Olsson et al. (2013) found that the significant positive effects on satisfaction with the work commute of pedestrians and cyclists in turn led to higher levels of happiness based on self-reported ratings of frequency and intensity for positive and negative emotions.
Our approach, however, is not directly comparable to previous work as, unlike any other study (with the exception of Caviedes and Figliozzi (2018) and Fitch et al. (2020), who assessed peak stress during cycling journeys with bio-sensors), we used stress measured during the travel journey itself, rather than self-reported journey-related stress collected in recall instruments outside travel periods. This removes recall bias issues in addition to providing an objective measure of stress. The closest design in terms of temporal proximity of data collection is a study of self-reported stress collected upon arrival at work (Gottholmseder et al. 2009). The assessment concerned, however, general levels of stress rather than travel journey-related stress; travel modes failed to reach significance in the regression analysis, although travel time and predictability, which can be related to travel modes, were found to be the main drivers of stress (Gottholmseder et al. 2009).
While the use of an objective measure greatly adds value to our research, it is well known, however, that GSR is an imperfect measure of stress. The activation of sweat glands can be a result of nerves responding to a variety of stimuli such as physical activity, heat, emotions and so on, which despite our best efforts of including confounders such as METs and near-body temperature, may still have had residual impacts on our effects estimates. Also, sweat glands are activated when individuals are aroused, and this arousal can also emanate from positive emotions of excitement and awe for instance. For example, as Gatersleben and Uzzell (2007) conclude from their survey-based analysis of mode choice: "Driving is relatively unpleasant and arousing, public transport is unpleasant and not arousing, cycling is pleasant and arousing, and walking is pleasant and not arousing." Clearly the term "arousal" in their paper does not have the physiologic meaning we use to interpret GSR here, but this does highlight the complexity of interpreting such data.
A related drawback of our approach is that our objective measurements also relied on identifying exact modes and times of travel to match GSR measurements. As with all such studies, the algorithm used to detect transport modes was not perfect despite efforts to make the most efficient use of the three forms of data sources (GPS, travel diary, SenseWear) to triangulate and improve prediction. Errors in travel and mode detection may have led to exposure misclassification, imprecision and bias when evaluating the causal effects.
Another difference with previous studies is that, while others compared measures of stress or satisfaction between modes, we compare stress measurements while cycling to times not cycling, with equivalent analyses for walking and motoring. The non-cycling (or non-walking, or non-motoring) times will include times in any other activity. As these non-cycling times are matched for METs among other factors, they include travel in other modes, or participation in indoor physical activity, for example (the matched control group contains 13% of observations taken while in other travel modes). The approach enables a more like-for-like comparison of activity levels. It may seem less useful for direct comparison across travel modes, but the repeated analysis with walking and motoring enables a relative assessment of these modes.
An important drawback of our study is that our mode detection algorithm was not able to distinguish between different motorized modes of travel. Judging from the PASTA baseline survey questionnaire, we can expect slightly more public transport users than car users (76 participants use public transport three days a week or more, versus 49 for cars). The literature has been relatively consistent with regard to car use, with a general understanding that car use increases stress due to congestion, travel time and the behaviour of other road users, although some positive psychological factors have also been demonstrated, such as perceptions of autonomy, protection, power and status (Gatersleben and Uzzell 2007). The literature has been less consistent in findings of stress associated with public versus private motorized transport use, however, in part because the specific mode of public transport (e.g., bus, underground, train) seems to bring different benefits or harms (Chng et al. 2016; Legrain et al. 2015; Singleton 2019). With potentially opposite effects of public vs private motorized transport modes, our findings that motorized transport results in 2 to 4% increase in stress are thus to be interpreted with caution. As Gatersleben and Uzzell (2007) put it, "the use of private cars may be too arousing (stressful), whereas the use of public transport may be not arousing enough (boring)". The PASTA study from which our data originated was focused on active travel, hence less effort was put into detailing public and private motorized modes. We may safely infer that our findings support Gatersleben and Uzzell's (2007) conclusion that "Walking and cycling, however, score positively on arousal as well as pleasure (i.e., exciting and pleasurable) and therefore seem an optimum form of travel from an affective perspective." Our study brings significant improvements to previous work as it attempts to establish causation through the propensity score matching methodology.
The approach is meant to approximate a randomized controlled trial, which would be difficult to establish in the real world. A previous study was able to assess impacts of change towards active travel on wellbeing with a longitudinal design. Martin et al. (2014) used a large longitudinal panel survey in which sufficient numbers of travel mode changes towards active travel were identified to estimate its effects on wellbeing. Although causal effects are still difficult to establish with such a design as reasons for travel mode shifts are not known, their study still provides a powerful finding that switching from the car to active travel (walking or cycling) results in significant improvements in self-reported wellbeing. A much smaller panel study based in Cambridge failed to detect any significant impact of increasing or decreasing cycling to work (but not modal shifts) on physical or mental wellbeing or sickness absence (Mytton et al. 2016).

Study limitations and future research
The choice of confounders in the matching process plays an important role in producing unbiased estimates of the effect of travel modes on GSR; however, any measurement error in the confounders could introduce various biases, discussed here. For example, measurement error related to physical activity estimates in different travel modes could potentially introduce a bias in our results, given that we match on METs. The literature is relatively inconsistent on Sensewear accuracy across various activities, in particular cycling and walking (e.g., Bhammar et al. 2016; van Hoye et al. 2014; Powell et al. 2016), so we could not ascertain any systematic bias introduced in METs. If anything, the device would under-estimate rather than over-estimate METs while cycling, which would lead to a conservative estimate (i.e., under-estimate) of the benefits of cycling on GSR.
We also matched on demographic and socioeconomic characteristics shared by participants as the choice of travel modes is often correlated with them and may have impacts on relationships between travel modes and the outcome. However, PSM can only account for those measured covariates that are included in the matching process, and any bias due to potentially unmeasured confounding may remain after matching (Garrido et al. 2014).
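As an illustration of why only measured covariates can be balanced, the matching step can be sketched as follows. This is a generic greedy 1:1 nearest-neighbour match on precomputed propensity scores, not the specification used in this study; the function name `match_nearest`, the `caliper` value and the toy scores are all hypothetical.

```python
def match_nearest(treated_scores, control_scores, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    Inputs are propensity scores (probability of treatment given the
    MEASURED covariates). Anything not used to estimate those scores
    cannot be balanced by this step.
    """
    available = set(range(len(control_scores)))
    pairs = []
    for t_idx, t_score in enumerate(treated_scores):
        if not available:
            break
        # Closest remaining control; sorted() makes tie-breaking deterministic.
        c_idx = min(sorted(available),
                    key=lambda j: abs(control_scores[j] - t_score))
        # Keep the pair only if the scores are close enough (hypothetical caliper).
        if abs(control_scores[c_idx] - t_score) <= caliper:
            pairs.append((t_idx, c_idx))
            available.remove(c_idx)
    return pairs


# Toy scores (invented for illustration only).
treated = [0.30, 0.62, 0.90]
control = [0.10, 0.33, 0.60, 0.58]
pairs = match_nearest(treated, control)
print(pairs)
```

In this toy example the third treated observation has no control within the caliper and is left unmatched, which is how poor overlap on measured covariates shows up in practice.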
Although we have included as many factors as possible as measured confounders in this study, other environmental and personal factors certainly play a role in the relationships between stress and different travel modes. For example, trip purpose may be relevant, as seen in Morris and Guerra's (2015) analysis of the American Time Use Survey, which showed that work-related travel had statistically significant detrimental impacts on affect (and while cycling had the most positive impact on affect, the relationships with travel modes were not statistically significant). Also, the quality and availability of mode-specific transport infrastructure, proximity to traffic, or availability of greenspace have been shown to matter in previous research (e.g., Caviedes and Figliozzi 2018; Fitch et al. 2020; Fan et al. 2011). For transit, in-vehicle delays have been shown to be a major source of dissatisfaction (Carrel et al. 2016). Weather conditions could also affect stress (Paige Willis et al. 2013). Finally, as has been argued by De Vos (2019), attitudes may be essential elements to account for, as it may be the consistency between the chosen travel mode and attitudes towards that mode that drives satisfaction with travel. Including more possible environmental, personal, or circumstantial confounders in future studies would help further reduce bias and thus assess the relationships between travel modes and stress more accurately.
To tackle some (but not all) of these limitations, GPS tracking data opens up further research opportunities to account for more confounders, and also to investigate the types of built environment features that can minimize stress while traveling. As a means of disentangling different sources of stress and other factors, including attitudes, that may affect bio-sensed responses such as GSR, a mixed-method approach with interviews following monitored journeys could also be developed.
We may also have introduced errors in both our treatment assignment and our outcome measurement. While we strove to improve mode detection by combining GPS and physical activity data (see Appendix Section 1), compared to self-reported activity diaries (themselves not a gold standard), our detection algorithm adequately identified only 38% of minutes in a trip in any mode and 23% of minutes cycling (Orjuela 2018). In the absence of a gold standard, and as the algorithm generated equally low sensitivity (i.e., percentage of recognised trip minutes over all trip minutes reported in the diary) and positive predictive values (i.e., percentage of true trips identified over all data identified as a trip), however, it is difficult to assess whether any systematic bias was introduced, and in which direction. In addition, more detailed mode-specific analyses would also need to be carried out in the future. As we had an "unknown" travel mode category for which mode could not be detected, we conducted a sensitivity analysis excluding these 620 observations to ensure no bias would be introduced from any potential (but unlikely) systematic error in assigning travel modes. Results shown in Appendix Figure S7 and Table S6 are virtually the same. We could not resolve, however, the issue of distinguishing between private and public motorized travel modes. As these are known to have different impacts on stress, further GSR studies should investigate these as separate modes. Lastly, direct comparisons between travel modes, rather than our approach of separate repeated analyses of treatment mode versus other matched activities, are also needed. For example, although we observed in Fig. 2 that cycling has a stronger negative impact on GSR than walking, the reason is unclear, and future research may further explore the relative impacts of travel modes.
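The two agreement measures defined above can be made concrete with a minimal sketch, assuming minute-level boolean labels from the diary and from the detection algorithm. The function name `sensitivity_and_ppv` and the toy labels are hypothetical and are not taken from the study's data or code.

```python
def sensitivity_and_ppv(diary_trip, detected_trip):
    """Minute-level agreement between diary and algorithm labels.

    Both inputs are equal-length sequences of booleans, one per minute:
    True means the minute is labelled as part of a trip.
    """
    if len(diary_trip) != len(detected_trip):
        raise ValueError("inputs must cover the same minutes")
    # Minutes labelled as a trip by BOTH the diary and the algorithm.
    both = sum(d and a for d, a in zip(diary_trip, detected_trip))
    diary_total = sum(diary_trip)        # trip minutes reported in the diary
    detected_total = sum(detected_trip)  # minutes the algorithm called a trip
    sensitivity = both / diary_total if diary_total else float("nan")
    ppv = both / detected_total if detected_total else float("nan")
    return sensitivity, ppv


# Toy labels (invented for illustration only).
diary = [True, True, True, True, False, False, False, False]
detected = [True, False, False, False, True, False, False, False]
sens, ppv = sensitivity_and_ppv(diary, detected)
print(sens, ppv)
```

Because the diary is itself imperfect, both quantities describe agreement with the diary rather than accuracy against a true reference.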
As for GSR as a measure of stress, it is clear that sweat glands might be activated by positive arousal, which could lead to a higher GSR not necessarily associated with higher levels of stress. This type of arousal may be present in either the intervention or the control group, however, so we are unsure of its effects on our outcome. We do note, however, that in general GSR seems to be a good indicator of stress. In a recent literature review and comparison between laboratory and real-world conditions to measure stress (Kyriakou et al. 2019), the two studies that used only GSR as their physiological response to stress reported accuracies between 82.8% and 95% (Setz et al. 2010; Cho et al. 2017). The other studies reviewed combined GSR with other measurements such as skin temperature, leading to sometimes marginally improved accuracies (varying overall from 74.5% to 97%). Here, we have used near-body temperature as a covariate, as this will be influenced by the varying degrees of physical activity in different modes. We also note that we ignored some temporal components of GSR and physiological stimuli. There are typically three phases of GSR activation: latency (the time between a stimulus and the start of the response), rising time (time between the stimulus and the peak of the response), and half-recovery time (time from the peak value of the response to the point of 50% recovery). The time between the stimulus and 50% recovery can be between 2 and 20 s (Kyriakou et al. 2019); however, recovery can take longer if another stimulus happens (Christopoulos et al. 2019). In our data, the entire cycle takes place within one step of our minute-by-minute analysis, so we may be missing some subtleties of arousal which would warrant more temporally resolved analyses in the future.
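To make the three phases concrete, the sketch below extracts them from a single sampled response, using the definitions given above. It assumes one isolated stimulus, a signal sampled at a fixed period, and a simple threshold rule for response onset; the function name `gsr_phases`, the `onset_threshold` value and the toy signal are hypothetical and do not reflect the processing used in this study.

```python
def gsr_phases(signal, stimulus_idx, sample_period_s=1.0, onset_threshold=0.05):
    """Latency, rising time and half-recovery time for one GSR response.

    Returns None if the signal never rises above the baseline by more
    than onset_threshold after the stimulus.
    """
    after = signal[stimulus_idx:]
    baseline = after[0]  # level at the moment of the stimulus
    # Onset: first sample exceeding the baseline by the (assumed) threshold.
    onset = next((i for i, v in enumerate(after)
                  if v - baseline > onset_threshold), None)
    if onset is None:
        return None
    # Peak: highest sample after the stimulus (first one if tied).
    peak = max(range(len(after)), key=lambda i: after[i])
    # 50% recovery: first sample at or below halfway between peak and baseline.
    half_level = baseline + 0.5 * (after[peak] - baseline)
    half = next((i for i in range(peak, len(after))
                 if after[i] <= half_level), None)
    return {
        "latency_s": onset * sample_period_s,
        "rising_time_s": peak * sample_period_s,  # stimulus to peak, as defined above
        "half_recovery_s": None if half is None else (half - peak) * sample_period_s,
    }


# Toy signal sampled once per second (invented for illustration only).
toy = [1.0, 1.0, 1.0, 1.2, 1.6, 2.0, 1.8, 1.5, 1.3, 1.1]
phases = gsr_phases(toy, stimulus_idx=1)
print(phases)
```

In this toy response the stimulus-to-50%-recovery interval is 6 s, so the whole cycle would fall inside a single one-minute analysis step.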

Conclusions
Our study provides robust evidence that cycling can reduce stress compared to other activities in urban life. Walking also provides a significant but more modest stress benefit. By relying on an objective proxy for stress (GSR) and a robust statistical framework (PSM), this study substantially strengthens the existing literature on the mental health benefits of active travel. While this seems inconsistent with the well-established literature on fear of traffic as a strong deterrent of cycling, it suggests that stress reduction benefits are felt once active travel behaviour is adopted; in other words, encouraging people to experience active travel is likely to be the best way to promote it. Finding ways to reduce stress as part of daily life is both a major challenge and an essential component of healthy living. Active travel offers a seamless solution for integrating stress-reducing activities into daily routines. Our findings add to a growing and convincing body of literature making the case for cities to adopt urban land use and design strategies that will promote active travel, and for health practitioners to integrate active travel into their healthy lifestyle recommendations.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.