Validity of electrodermal activity-based measures of sympathetic nervous system activity from a wrist-worn device

Measuring electrodermal activity (EDA) on the wrist with the use of dry electrodes is a promising method to help identify person-specific stressors during prolonged recordings in daily life. While the feasibility of this method has been demonstrated, detailed testing of the validity of such ambulatory EDA is scarce. In a controlled laboratory study, we examined SCL and ns.SCR derived from wrist-based dry electrodes (Philips DTI) and palm-based wet electrodes (VU-AMS) in 112 healthy adults (57% females, mean age = 22.3, SD = 3.4) across 26 different conditions involving mental stressors or physical activities. Changes in these EDA measures were compared to changes in the pre-ejection period (PEP) and stressor-induced changes in affect. Absolute SCL and ns.SCR frequency were lower at the wrist compared to the palm. Wrist-based ns.SCR and palm-based ns.SCR and SCL responded in a direction consistent with our experimental manipulation of sympathetic nervous system (SNS) activity. Average within-subject correlations between palm-based and wrist-based EDA were significant but modest (r SCL = 0.31; r ns.SCR = 0.42). Changes in ns.SCR frequency at the palm (r = −0.44) and the wrist (r = −0.36) were correlated with changes in PEP. Both palm-based and wrist-based EDA predicted changes in affect (6.5%–14.5%). Our data suggest that wrist-based ns.SCR frequency is a useful addition to the psychophysiologist's toolkit, at least for epidemiology-sized ambulatory studies of changes in sympathetic activity during daily life.


Introduction
The European parliament recognizes mental health as a fundamental human right and launched the EU Action Plan on mental health for 2021-2027, which is a continuation of the World Health Organization's Mental Health Action Plan 2013-2020 (World Health Organization, 2013). A core element is the development of effective strategies for stress detection and management. In the past decade considerable effort has been put into the development of biosensors that help identify person-specific stressors by inspection of the body's physiological responses to daily life settings (Wu et al., 2012; Carbonaro et al., 2013; Jung and Yoon, 2017; Jebelli, 2019). Sweat gland activity on the wrist is one of these physiological signals. It builds on a rich psychophysiological research tradition and recording is feasible for prolonged periods of time in daily life. The innervation of the sweat glands is entirely through sympathetic nerves and sweat gland activity is considered one of the purest measures of sympathetic nervous system activity (Critchley, 2002). The sympathetic nervous system (SNS) is rapidly activated when an individual is faced with a situation that is perceived as threatening or challenging, eliciting the so-called "fight or flight response" (Brindle et al., 2014; Jansen et al., 1995), in parallel to subjective feelings of arousal and negative affectivity often denoted as 'stress'. Activation of the SNS results in both an increase in the total number of activated sweat glands and in more secretion by the sweat ducts. These changes in sweat gland activity in turn lead to changes in the conductance of electrical activity through the skin, also denoted as electrodermal activity (EDA).
EDA is relatively easy to measure and has been used in a wide variety of research fields, notably attention, information processing, and emotion (Dawson et al., 2000). Decades of research have shown that various laboratory stressors increase Skin Conductance Level (SCL) compared to conditions of low arousal during pre- or post-stress baselines. These same stressors also systematically increase the frequency of non-specific skin conductance responses (ns.SCRs), by some referred to as spontaneous fluctuations (SF) (Bach et al., 2010). These skin conductance responses are not studied as a directly evoked response to a specific experimenter-controlled external stimulus. Rather we define ns.SCRs, consistent with Posada-Quintero and Chon (2020), to reflect "fluctuations in EDA in the presence of an ongoing sustained stimulus over a period of time", which differs slightly from the alternative definition that ns.SCRs "occur in the absence of external stimuli and in the absence of artifacts such as movements and sighs". Frequency of ns.SCRs is measured in peaks per minute over longer time periods. Both SCL and ns.SCRs are considered indicators of SNS activity that show sensitivity to stress. Both resting levels and responses to stress of these EDA measures show relatively stable inter-individual differences which are substantially heritable (Crider et al., 2004; Schell et al., 1988; Tuvblad et al., 2010; Wang et al., 2015). Since these EDA measures can be measured independently of knowledge of the content or timing of specific stimuli, they are in principle very suitable as indicators of SNS activity outside of the controlled laboratory environment.
The classical approach records EDA by passing a small electrical current through a pair of active/reference electrodes placed on the hand, either the middle phalanges of two adjacent fingers or the palm of the hand. These locations are preferred because the hand contains the highest density of eccrine sweat glands (Posada-Quintero and Chon, 2020). However, a practical problem facing ambulatory measurement of EDA is that the typical location for electrode placement on the fingers or the palms of the hand is quite obtrusive and interferes with daily activities. This introduces bias in the behavioral repertoires assessed and increases risk for noisy or lost signals. Another practical problem for ambulatory measurement of EDA with the classical approach is the use of wet electrodes. Wet electrodes make contact with the skin through the use of electrolyte paste. When measuring over longer periods of time the electrolyte gel may gradually spread out on the skin and hydrate the corneum. This can lead to both an increase in the recording area of the electrode (and thus observed EDA) and danger of electrode loosening. Especially the latter has a large influence on data quality and limits the length of the recording. Moreover, electrolyte paste might need to be reapplied when measuring over multiple days or even weeks, making it impracticable for these types of recordings.
A solution to both of these limitations is using electrodes without electrolyte paste on the wrist. Dry electrodes are generally reusable and easier to apply than wet electrodes, making them a promising method to measure EDA in daily life over longer periods of time (Posada-Quintero and Chon, 2020). The wrist is a good alternative location as many smartwatches already make contact with the skin on the wrist, and these are readily tolerated for prolonged wear time. van Dooren et al. (2012) showed that measuring EDA on the wrist is indeed a good alternative to the hands and Westerink et al. (2009) and Poh et al. (2010) have shown that measuring EDA on the wrist with dry electrodes is feasible. However, while the ambulatory assessment of EDA on the wrist with dry electrodes is attractive and feasible, it should be noted that these electrodes come with their own set of problems, including the dependency on sufficient amounts of sweat to detect EDA. Even though the study by Poh et al. (2010) showed high correlations between SCL on the wrist and fingers, the results of other studies have not been encouraging (Konstantinou et al., 2020; Milstein and Gordon, 2020; Menghini et al., 2019; van Lier et al., 2019; Kleckner et al., 2020). The evidence for ns.SCR responses rather than absolute SCL levels is more encouraging. A study by van Lier et al. (2019) showed that the mean amplitude of the ns.SCRs of dry wrist electrodes increases in a similar fashion to wet palm electrodes in response to a social stressor (sing-a-song stress test). In addition, Kleckner et al. (2020) have shown that exposure to a mental arithmetic stressor and physical activity led to an increase in the detection of ns.SCR of dry electrodes on the wrist.
At present, detailed testing of the validity of ambulatory EDA remains scarce. We therefore set up a controlled laboratory study and examined construct, criterion, and predictive validity for wrist-based dry electrode EDA monitoring in response to various mental and physical stressors. We employed an existing wrist-based dry-electrode device that evolved from the Emotion Measurement platform and monitors SCL and ns.SCR frequency (DTI5, Philips Ltd., The Netherlands) and compared the wrist-based EDA measures to parallel recorded EDA measures using an active electrode on the palm of the hand (thenar eminence). Because the DTI5 uses a proprietary algorithm to extract ns.SCR frequency, we added a second scoring of ns.SCR frequency from the raw wrist-based signal that was identical to the scoring of the palm-based signal (Joffily, 2012). First, construct validity was assessed by exposing participants to known experimental manipulations of SNS activity and testing whether the wrist-based EDA measures display the expected response pattern. We hypothesized that the mean SCL and ns.SCR frequency would increase from pre-task baseline level during exposure to mental and physical stressors, and then decrease again during recovery from those stressors. Second, to test criterion validity, we compared the within-subject changes across 25 experimental conditions in wrist-based EDA to changes in the Pre-Ejection Period (PEP: the time interval between the start of left ventricular depolarization and the opening of the aortic valve). We hypothesized that mental or physical stress-induced decreases in PEP, a proven and validated measure of cardiac SNS activity (Berntson et al., 1994a,b), would be associated with increases in wrist-based EDA, indexing SNS activity on the skin. Finally, to assess predictive validity, we tested whether the changes in EDA measures predicted parallel changes in self-reported positive and negative affect induced by the mental stress tasks. We hypothesized that changes in the wrist-based EDA measures are predictive of the changes in affect induced by mental stress.

Study population
Participants were required to be between the age of 18 and 48, Dutch speakers, and currently employed, or in a schooling trajectory. Exclusion criteria were a body-mass index above 30, heart disease, high blood pressure, high cholesterol, diabetes, thyroid or liver disease, and use of antidepressants, anticholinergics, or any other medication that has been shown to influence the SNS. Female participants were measured within the first two weeks following the last day of their menstrual cycle to account for hormonal changes.
Recruitment of potential participants was done through several routes. First, advertisements were placed on the Vrije Universiteit (VU) campus and the VU participant recruitment system SONA (a cloud-based participant pool software) to recruit students and VU employees. Second, participants were recruited from the local community through social media, by advertising on a Dutch Facebook page dedicated to participant recruitment (Proefbunny) and the investigators' personal social media pages. Finally, co-workers, friends and family of the investigators, who themselves were excluded from participating, were asked to widely share the advertisement for this experiment in their social networks.
Interested participants could contact the research team through the contact information in the advertisement. During an ensuing telephone call, it was established whether the potential participant met the study criteria and was interested in receiving the full information on the study. In case of a positive response, participants received the study information letter by e-mail. After a period of two weeks the research team contacted the participants and gauged their interest in actual participation in this study. After the volunteers were given complete, adequate written and oral information regarding the nature, aims, possible risks and benefits of the study, they were scheduled for the study visit at the Vrije Universiteit in Amsterdam.
Participants who were students received research credits, while other participants were compensated with a €50 gift voucher. All participants provided written informed consent before the start of the study.
As the "ground truth", exosomatic palmar EDA was obtained with the Vrije Universiteit Ambulatory Monitoring System (VU-AMS) on the thenar eminence with direct current. However, the classical placement of the electrodes was adjusted to fit better with the daily life character of the experimental procedures. By limiting the number of electrodes placed on the hand to one, participants had more freedom to move their hand. We considered the thenar eminence of the non-dominant hand as the least interfering placement at the hand. The reference electrode was placed on a less obtrusive location, at the ventromedial forearm approximately 15 cm below the hand electrode (Fig. 1A). This is considered a relatively inactive reference site (Venables and Christie, 1980) which reduces signal amplitude but greatly adds to participant comfort. On the thenar eminence adhesive tape was used to reduce movement and improve fixation to the skin and the skin curvature. In addition, the wire was fixed by means of tape to the skin 10-15 cm from the electrode, so the participants were able to move their hand in all directions without exerting pull on the electrode.
We chose to use different electrodes for the active and reference sites to optimize signal quality. On the thenar eminence disposable Biopac Systems EL507 EDA isotonic gel electrodes (Biopac systems Inc., Goleta, US) were used. These electrodes are designed for electrodermal activity measurement and are pre-gelled with isotonic gel (Ag/AgCl contact, wet liquid gel (0.5% chloride salt) electrolyte, 11 mm diameter contact area). Following guidelines, no preparations were performed on the skin to preserve its electrical properties (Dawson et al., 2000) and electrodes were placed at least 5-10 min before the start of the experimental procedure to avoid decreased conductance due to electrolyte penetration of the stratum corneum from the isotonic gel. On the ventromedial forearm 55 mm Kendall H98SG hydrogel ECG electrodes (Medtronic, Eindhoven, Netherlands) were used. ECG electrodes are designed to detect the electrical currents of the heart. For ECG recording EDA is considered an artefact, therefore ECG electrodes contain a layer of electrically conductive gel between the skin and the electrodes to reduce resistance. By lightly scrubbing the skin with abrasive paper, part of the stratum corneum was removed to further lower resistance. Placing the inactive electrode on an electrodermally inactive site with very little resistance provides a higher and cleaner EDA signal compared to placing the inactive electrode on an electrodermally active site whose resistance fluctuates with ongoing EDA.
EDA was recorded with a direct voltage of 0.5 V, a sampling frequency of 10 Hz, and 16 bit (A/D converter) precision in the 0-100 microSiemens (μS) range. EDA signal quality assessment was performed after completion of the recording with a simple automated artefact rejection algorithm (i.e. sudden drastic drops or increases in μS based on the first derivative, and flattening of the signal, verified by visual inspection) in MATLAB. Segments flagged as artefact were removed from further analysis. The EDA signal was filtered using a low-pass 0.5 Hz Butterworth filter to deal with noise and motion artefacts (Doberenz et al., 2011).
2.2.1.1.2. Wrist-based EDA. Wrist-based exosomatic EDA was obtained with a CE approved wearable skin conductance sensor type DTI5 (Discreet Tension Indicator version 5, Philips) (Fig. 1B), under development as a smartwatch for commercial availability on the consumer market. The DTI5 has a 47.1 × 15.5 × 47.8 mm casing and weighs 40 g. It contains two 'banana'-shaped electrodes made of black hydrophilic silicone rubber that are placed at a distance of approximately 1 cm (see Fig. 1). The band is placed directly behind the head of the ulna. Upon arrival at the laboratory participants had already been wearing the device for ~24 h. This allowed moisture under the silicone rubber to build up, which, from past experience in prototype testing, yields a better conductive contact between the skin and the electrodes.
The DTI5 applies a direct voltage of 1 V between both electrodes to measure skin conductance with a frequency of 160 Hz within a range of 0 to 24 μS and a precision of 22 bits. The DTI5 has an internal, on-line signal quality rating. The maximal quality rating of 3 is proportionally lowered based on the presence of certain features, for instance a change rate that exceeds plus 10% or minus 1% per second. The lowered quality value not only holds for the moment of the actual change rate disturbance, but starts 0.5 s prior to the detected disturbance and ends 5 s after. Data of quality 1 was considered to reflect an artefact and segments with quality rating lower than 1 were removed from further analysis. The 160 Hz data is subsequently low-pass filtered (cross-over 5 Hz) to remove repetitive distortions in the skin conductance signal that coincide with motion.
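The change-rate rule described above can be illustrated with a minimal Python sketch. It is not the DTI5's proprietary quality algorithm: it only produces a binary flag (the device lowers a 0-3 rating proportionally), and the function name `flag_disturbances` and its parameters are hypothetical. The thresholds (+10% and −1% per second) and the padding (0.5 s before, 5 s after) are taken from the text.

```python
def flag_disturbances(signal, fs, max_rise=0.10, max_fall=0.01,
                      pre_s=0.5, post_s=5.0):
    """Flag samples around change-rate disturbances.

    signal : list of skin conductance samples (in microsiemens)
    fs     : sampling frequency in Hz
    Returns a list of booleans, True where a sample is flagged.
    Assumption: a binary flag stands in for the device's graded rating.
    """
    n = len(signal)
    flagged = [False] * n
    pre = int(round(pre_s * fs))    # samples flagged before the disturbance
    post = int(round(post_s * fs))  # samples flagged after the disturbance
    for i in range(1, n):
        prev = signal[i - 1]
        if prev <= 0:
            continue  # relative change is undefined for non-positive values
        # Fractional change per second between consecutive samples
        rate = (signal[i] - prev) / prev * fs
        if rate > max_rise or rate < -max_fall:
            for k in range(max(0, i - pre), min(n, i + post + 1)):
                flagged[k] = True
    return flagged
```

For a 10 Hz signal with a single step at sample 50, the flag covers samples 45 through 100, that is, 0.5 s before and 5 s after the step.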

EDA measures.
The measures of interest that can be derived from both EDA signals are skin conductance level (SCL) and frequency of non-specific skin conductance responses (ns.SCR). Both these measures typically increase with increased SNS activity (Posada-Quintero and Chon, 2020). For both devices SCL is calculated as the mean EDA level in μS on the filtered artefact-free portion of every experimental condition. Peaks from both palm and wrist recordings were detected using the EDA master toolkit (Joffily, 2012) in MATLAB on the filtered artefact-free fragments of the EDA signal. As suggested by Braithwaite et al. (2013), ns.SCRs were counted if their peak amplitude reached the threshold of 0.01 μS and their rise time fell within the range of 0.1-5 msec. The parameter for detecting responses in rapid succession (overlapping responses) was set to ON. The resulting total number of ns.SCRs_mat during an experimental condition was counted and divided by the artefact-free minutes of the corresponding condition to obtain ns.SCR frequency in peaks per minute. The DTI contains an internal method of peak detection that makes use of a curve fit method, yielding ns.SCR_cf (for details, contact Luc Vosters, luc.vosters@philips.com). The correlation within each participant between the peaks detected on the wrist signal by the two methods (internal algorithm ns.SCR_cf vs. MATLAB algorithm ns.SCR_mat) was high (r mean = 0.80, IQR = 0.71-0.94). Even so, we present the results from both the device-internal and toolkit scoring algorithms jointly throughout. This allows comparison of palm and wrist using the same method of peak detection across both locations, as well as comparison of palm and wrist that additionally uses a different method of peak detection for the wrist location.
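The step from detected peaks to ns.SCR frequency can be sketched in Python as follows. This is a simplified trough-to-peak counter and not the EDA toolkit (Joffily, 2012) or the DTI5 curve-fit method: it applies only the 0.01 μS amplitude threshold, omits the rise-time criterion and the handling of overlapping responses, and assumes the input is one filtered, artefact-free fragment. The function names are hypothetical.

```python
def count_scrs(signal, min_amplitude=0.01):
    """Count trough-to-peak rises of at least min_amplitude (microsiemens).

    Assumption: a response is any rise from a local trough to a local
    peak; a rise that has not yet turned down at the end is not counted.
    """
    count = 0
    trough = signal[0]
    rising = False
    for prev, cur in zip(signal, signal[1:]):
        if cur > prev:
            if not rising:
                trough = prev   # the rise starts from this sample
                rising = True
        elif cur < prev and rising:
            # prev was a local peak; test its height above the trough
            if prev - trough >= min_amplitude:
                count += 1
            rising = False
    return count


def scr_frequency(signal, fs, min_amplitude=0.01):
    """Return ns.SCR frequency in peaks per artefact-free minute."""
    minutes = len(signal) / fs / 60.0
    if minutes == 0:
        raise ValueError("signal is empty")
    return count_scrs(signal, min_amplitude) / minutes
```

Dividing by the artefact-free duration rather than the nominal condition length keeps the frequency comparable across conditions with different amounts of rejected data.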

Pre-ejection period
The PEP has been shown to be a reliable non-intrusive cardiac measure of SNS activity (Sherwood et al., 1990; Kelsey, 2012). PEP was obtained by calculating the time between the start of ventricular depolarization (Q onset) in the electrocardiogram (ECG) and the time the aortic valve opens (B point) in the impedance cardiogram (ICG) collected by the VU-AMS device. ECG and ICG were recorded from five adhesive 55 mm Kendall H98SG hydrogel ECG electrodes (Medtronic, Eindhoven, Netherlands) placed on the chest and back of the participants (Fig. 2) with a recording frequency of 1000 Hz. The locations of the Q onset and B point are automatically placed by the Vrije Universiteit Data Acquisition and Management Software (VUDAMS, available at: http://www.vu-ams.nl/support/downloads/software/) and manually corrected after visual inspection when necessary.

Anthropometrics
The participant's body weight (kg) and body mass index (kg/m²) were measured to reflect adiposity. After removal of shoes and coats, height was measured to the nearest millimeter using a stadiometer and weight was assessed to the nearest 0.1 kg using a digital scale. Body mass index (BMI) was calculated as body weight in kilograms divided by height in meters squared. Second, body fat distribution was measured using waist circumference (cm) and waist-to-hip ratio (W/H).

Fig. 2.
Electrode placement for ECG and ICG recordings. The electrodes were placed on top of the sternum at the suprasternal notch (1); at the bottom of the sternum on the processus xiphoideus (2); at the apex of the heart on the ninth left intercostal space (3); at the back, on the spine, at least 3 cm above electrode 1 (4); at the lower back, on the spine, at least 3 cm below electrode 2 (5).

Interview and questionnaires
A structured interview regarding the participant's demographics, medication use, perceived physical and mental health and lifestyle behaviors was performed to reconfirm that participants met the inclusion criteria and to obtain a series of potential confounders/explanatory variables. Two additional questionnaires were supplied: 1) the Edinburgh handedness inventory (Oldfield, 1971) to determine the participant's hand preference and 2) the Profile of Mood States short form (POMS), a psychological rating scale used to assess current overall mood state (McNair et al., 1971).
Affect was repeatedly rated during the experiment directly following certain tasks (see Table 1) using the Maastricht Questionnaire (Myin-Germeys et al., 2001). Positive affect scores were obtained by asking the participants to rate on a scale of 1 (not at all) to 7 (very) whether they felt relaxed, cheerful, enthusiastic and content and averaging the score over the 4 items. Negative affect was obtained by averaging the scores for 5 items: insecure, lonely, anxious, irritated, and down.
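The affect scoring amounts to two item averages, which can be sketched in Python as below. The item names and the 1-7 range come from the text; the function name `affect_scores` and the dictionary input format are assumptions made for illustration.

```python
POSITIVE_ITEMS = ("relaxed", "cheerful", "enthusiastic", "content")
NEGATIVE_ITEMS = ("insecure", "lonely", "anxious", "irritated", "down")


def affect_scores(ratings):
    """Return (positive_affect, negative_affect) for one rating moment.

    ratings : dict mapping each item name to a rating from 1 to 7.
    Assumption: all nine items are present; missing items raise KeyError.
    """
    for item in POSITIVE_ITEMS + NEGATIVE_ITEMS:
        if not 1 <= ratings[item] <= 7:
            raise ValueError(f"rating for {item!r} is outside 1-7")
    positive = sum(ratings[i] for i in POSITIVE_ITEMS) / len(POSITIVE_ITEMS)
    negative = sum(ratings[i] for i in NEGATIVE_ITEMS) / len(NEGATIVE_ITEMS)
    return positive, negative
```

A change score for a stress task would then be the difference between the scores obtained after the task and those obtained at the preceding rating moment.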

Experimental tasks
2.5.1.1. Posture. Changes from a supine to a sitting to a standing position are well-known to generate a stepwise increase in SNS activity. Impact of postural manipulation on our SNS activity measures was obtained by having the participants lie down on a stretcher bed, sit upright on a comfortable chair with both feet on the ground and stand upright, each for 3 min.
The tone avoidance (TA) task aims to induce "effortful active coping" (de Geus et al., 1990; van der Mee et al., 2020). During the TA task participants have to react to a stimulus (an "X") that flares up irregularly in one of the corners of a computer screen. Participants have to respond as fast as possible to this stimulus by pressing the button opposite to this corner on their response panel. Incorrect or too slow responses are punished with a red bar and a loud noise burst. Correct responses are rewarded by a green bar.
The Paced Auditory Serial Addition Test (PASAT) is a measure of cognitive function that assesses capacity and rate of information processing and sustained and divided attention (Tombaugh, 2006). The PASAT is presented using prerecorded audio to ensure standardization in the rate of stimulus presentation. Single digits are presented at short intervals, traditionally every 3 s, and the respondent must add each new digit to the one immediately prior to it. Responses are made by clicking the corresponding answer (0-18) using a mouse, and must be given before the next stimulus is presented. Feedback is given by a green checkmark in case of a correct and timely answer, or a red x when the answer is wrong or too late. Shorter interstimulus intervals are known to increase the difficulty and perceived stressfulness of the task.
In the current implementation of the TA and PASAT tasks a staircase algorithm was used that adapted the criterion reaction time to the participant's average reaction time. This ensures that the level of difficulty is tailored to the skills of the participants, which may vary due to e.g. age or educational attainment. In addition, the application of such a staircase maintains task difficulty during repeated exposure: both the TA and PASAT tasks were repeated twice, which might induce habituation. To further ensure sufficient effort and engagement of the participants with these tasks a competition was set up in which the three best performing participants would gain an additional monetary reward of 50 Euros. A large and visible score board was used to keep the score, identifying participants by their participant ID code.
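The text does not specify the staircase rule beyond the fact that the criterion reaction time tracks the participant's average reaction time. As one possible reading, the Python sketch below sets the criterion to the running mean of the most recent reaction times. The class name, the window size, and the use of a moving window are all assumptions and not the implementation used in the study.

```python
from collections import deque


class ReactionTimeStaircase:
    """Adaptive criterion: running mean of recent reaction times.

    Assumption: a fixed-size moving window; the study's actual rule
    is not described in this section.
    """

    def __init__(self, initial_criterion_ms, window=10):
        self.criterion_ms = initial_criterion_ms
        self._recent = deque(maxlen=window)

    def update(self, reaction_time_ms):
        """Record one trial; return True if it met the current criterion."""
        on_time = reaction_time_ms <= self.criterion_ms
        self._recent.append(reaction_time_ms)
        # The new criterion is the mean of the reaction times in the window
        self.criterion_ms = sum(self._recent) / len(self._recent)
        return on_time
```

Because the criterion follows the participant's own speed, a participant who gets faster with practice faces a correspondingly stricter deadline, which is how such a rule can counteract habituation across repeated exposures.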
The SSST short is a recently developed adaptation of the Sing-a-Song Stress Test aimed at measuring social-evaluative stress in a quick and easy manner (van der Mee et al., 2020). In this test participants are told that they have to sit as still as possible in front of a computer (surrounded by cameras and voice recording equipment) while they are shown several messages, followed by a clock counting down from 60 to 0 s. They are informed that some of these messages only need to be read whereas others will contain instructions they have to follow when the counter reaches 0. One of these instructions is to sing a song of their choice out loud. The instructions additionally mention that their performance is recorded and will later be studied by conservatory students. The anticipatory interval of 60 s before the participant started singing was the stressor of interest, unaffected by the movement involved in the act of singing itself.
Raven's progressive matrices test is a nonverbal IQ test typically used in educational settings. It is a 60-item test, listed in order of difficulty, used in measuring abstract reasoning and regarded as an estimate of non-verbal fluid intelligence (Raven, 2003). In each test item, the participant is asked to identify the missing element that completes a pattern. We used the original test items; however we only gave the participants 4 min to complete the test, which is far too short to complete all items. The test was administered on a tablet computer and the remaining time was shown in bright red in the right corner of the screen. Beneath the timer their progress and number of errors were presented, further increasing the ego-threatening aspect of IQ testing.

Physical stressors.
To examine how the EDA measures captured the effects of general everyday life activities on SNS activity, several typical everyday life activities were conducted during the laboratory session (see Table 1, experimental timeline). Mild to moderate physical activity was induced by self-paced walking (at the pace they normally walk), fast walking (the pace they walk when they are in a hurry), bicycling, stair climbing and descending, mock dish-washing (without actual water and soap) and vacuum cleaning. To examine how the EDA measures captured standardized physical activity, participants had to jog/run on a treadmill at 3 incremental stages of speed (males: 5, 6.5, 8 km/h; females: 4.5, 6, and 7.5 km/h), each lasting 4 min. After a 3-minute cooling-down on the treadmill (males: 4 km/h, females: 3.7 km/h) participants sat down for a 3-minute recovery stage.

Procedure
The full research project included an initial data collection phase in a real-life ambulatory (~24 h, including the night) setting, but here we focus on the second phase, the standardized laboratory validation (~2.5 h of experimental manipulations) of wrist-based EDA obtained from the DTI5 device. During their initial visit to the laboratory (~1 h) at the start of ambulatory recording, participants provided informed consent, anthropometrics were measured, and the structured interview and questionnaires were administered. Subsequently, equipment for monitoring SNS activity was applied to the participant, with the EDA electrodes of the VU-AMS device and the DTI5 device on the non-dominant hand and wrist.
Once equipped with the measuring devices, participants left the laboratory for a day of ambulatory monitoring. They returned the next day for participation in the laboratory protocol. Upon their return, it was verified that all the measurement equipment was still in working order. Next, participants were informed that footage of their facial expressions, posture and voice would be recorded during the experiment. Furthermore, the participants were informed that during the tasks, including the SSST short, the experimenter would monitor their performance through a one-way mirror to ensure good compliance and quality of the recordings. Then all experimental manipulations were presented in a fixed order (see Table 1).
After the experimental session, all devices were removed and participants were provided the option to use a nearby shower. The experiment ended with a debriefing in which they were informed that the TA, PASAT and Raven tasks were purposefully made so difficult that they would be impossible to perform without errors. They were explicitly told that the test score rankings were only added to increase the stressfulness of the task and did not reflect their actual ability, and that their performance on the RPM test was no meaningful reflection of their intelligence. Furthermore, they were informed that their singing during the SSST short was not actually recorded and would not be studied by conservatory students. Nevertheless, the best performers on the TA and PASAT tasks were rewarded with an extra 50 euros, as promised.

Analytic strategy

2.7.1. Data inclusion and quality
To assess data quality the average percentage of artefact free signal per participant was calculated for both EDA signals: a condition of a participant was considered useable for analysis when the duration of valid data in the condition was at least 30 s, and when at least 20% of the signal of the entire experimental condition was artefact free for both DTI5 and VU-AMS signals. Otherwise the data for the whole condition was rejected. We decided to only include participants that had at least 3 useable conditions.
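The inclusion rule can be written down as a small Python sketch. The thresholds (30 s, 20%, 3 conditions) are those stated above. Applying the 30 s minimum to each device separately is an assumption, since the text attaches "for both DTI5 and VU-AMS signals" explicitly only to the 20% criterion. The function names and the input layout are hypothetical.

```python
def condition_usable(duration_s, valid_s_by_device,
                     min_valid_s=30.0, min_fraction=0.20):
    """Return True if a condition is usable for analysis.

    duration_s        : total duration of the condition in seconds
    valid_s_by_device : dict, e.g. {"DTI5": 60.0, "VU-AMS": 170.0},
                        artefact-free seconds per device
    Assumption: both thresholds must hold for every device.
    """
    return all(
        valid >= min_valid_s and valid / duration_s >= min_fraction
        for valid in valid_s_by_device.values()
    )


def participant_included(conditions, min_usable=3):
    """conditions: list of (duration_s, valid_s_by_device) tuples."""
    usable = sum(condition_usable(d, v) for d, v in conditions)
    return usable >= min_usable
```

For a 180 s condition, 32 s of valid wrist data would pass the 30 s minimum but fail the 20% criterion (32/180 ≈ 18%), so the condition would be rejected.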
Under classical signal detection theory we expect a lower EDA level in the wrist signal compared to the palmar signal since the density of sweat glands is ~5 times larger on the palm than on the wrist. The number of detected peaks is therefore also expected to be lower on the wrist. A study by Payne et al. (2016) showed that only in 30% of the cases when an SCR occurred at the fingers there was a simultaneous SCR at the wrist. However, during a stress task this percentage rose to 72% (Payne et al., 2016).
To assess the extent to which EDA levels were very low, making it difficult to separate signal from noise, the percentage of participants in which the average SCL was below 0.5 μS was calculated for both EDA signals (Milstein and Gordon, 2020). Due to the lower EDA on the wrist we also expect fewer ns.SCRs to be detected. The percentage of conditions where the number of ns.SCRs detected by either the internal or the MATLAB method was zero, i.e. no detected peaks at all, was calculated and compared for the EDA signals of each participant.

Data alignment and reduction
For accurate device-to-device comparisons, we synchronized the DTI5 and VU-AMS recordings by temporally aligning the EDA signals to the maximal cross correlation between the tri-axial accelerometer signals of both devices. Next we retained only data from the artefact free segments that fell within one of the experimental conditions.
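Alignment by maximal cross-correlation can be sketched in plain Python as below. The sketch assumes that both accelerometer signals have already been resampled to a common sampling rate and reduces each tri-axial sample to its vector magnitude, which is one common choice and not necessarily what was done in the study. The function names and the `max_lag` search bound are hypothetical.

```python
import math


def magnitude(samples):
    """Reduce tri-axial samples [(x, y, z), ...] to vector magnitudes."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]


def best_lag(reference, other, max_lag):
    """Return the lag (in samples) that maximises the cross-correlation.

    A positive lag means `other` is delayed relative to `reference`.
    Assumption: both signals share one sampling rate.
    """
    def centred(values):
        mean = sum(values) / len(values)
        return [v - mean for v in values]

    ref, oth = centred(reference), centred(other)
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Sum the products over the overlapping part of the two signals
        score = 0.0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(oth):
                score += r * oth[j]
        if score > best_score:
            best, best_score = lag, score
    return best
```

The returned lag, divided by the sampling rate, gives the time offset to subtract from the second device's timestamps before the condition start and stop times are applied to both EDA signals.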
In the present study the parameters of interest were defined as responses to short-term stressors and physical activities. Therefore for all wrist-based and palm-based SCL measures and the PEP, a mean value was generated across the same start and stop times for all conditions for each participant up to a total of 26 conditions, consisting of 3 posture conditions (lying, sitting and standing), 4 first-exposure mental stressors, 2 repeated mental stressors, 6 daily life activities, a physical stressor consisting of 4 levels, and 7 recovery periods separating the stressors. For the ns.SCR measures, the frequency of the peaks for each of the conditions was retained. Outlier detection and removal was performed on these measures using a 3.5 SD criterion together with careful visual inspection of the histograms.
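The automated part of the outlier step can be sketched as follows. The 3.5 SD criterion is from the text. Computing the mean and SD over the pooled values of one measure is an assumption, and the visual inspection of histograms that accompanied this step has no equivalent in the sketch. The function name is hypothetical.

```python
import statistics


def remove_outliers(values, n_sd=3.5):
    """Drop values more than n_sd sample standard deviations from the mean.

    Assumption: mean and SD are computed once over all supplied values.
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return list(values)  # all values identical, nothing to remove
    return [v for v in values if abs(v - mean) <= n_sd * sd]
```

Note that with a single-pass criterion an extreme value inflates the SD it is tested against, so very small samples can never produce a deviation beyond 3.5 SD.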

Multilevel analyses
Across participants, EDA data were available for analyses from at least 13 conditions, with an average of 25.8 within-subject observations.
To take into account that these observations are nested within participants we performed multilevel (ML) analyses, also referred to as linear mixed models or hierarchical linear models. Although Bland-Altman plots have been suggested as the appropriate method for device comparisons (van Lier et al., 2019), they are less suitable here, as we anticipate large between-subject differences in e.g. absolute SCL values at the palm and the wrist, and are primarily interested in the correspondence of within-subject changes in e.g. palm-based and wrist-based EDA, wrist-based EDA and PEP, and wrist-based EDA and affect.
A basic two-level ML model can be represented by the following formula, in which the outcome variable Y is a function of the intercept β0j, a predictor variable X with slope β1j, and a random error term (Blackwell et al., 2006):
$$Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + \varepsilon_{ij}$$
The lower level (level 1) is indexed by the subscript i and the higher level (level 2) by the subscript j. In this study the individual participants are treated as the level 2 unit, and the repeated measures across the various conditions within a participant as the level 1 unit.
ML analysis possesses a number of favorable characteristics that are well suited to our design. For instance, it does not require the number of repeated measures to be equal for all subjects and therefore is robust to missing data (assuming missingness at random). ML analysis can also explicitly test for the need to model inter-individual variation of the intercept and slope of the relationship between predictor and outcome. This means that each individual participant has their own intercept and slope coefficient value. Therefore the β0j and β1j coefficients that are predicted by the model can be further broken down into a mean intercept γ00 and mean slope γ10 with deviations from that mean U0j and U1j:
$$\beta_{0j} = \gamma_{00} + U_{0j}, \qquad \beta_{1j} = \gamma_{10} + U_{1j}$$
For each model specified it can be tested whether allowing the intercepts and slopes to vary improves the fit of the model. However, to allow for direct comparison of the different analyses, all models were run including a random intercept and random slope even if this did not improve model fit. Cross-level interaction effects between our level 1 predictor variable and age, biological sex and BMI were tested by adding all interactions to a single model. However, none of the analyses showed an interaction effect at p < .05 with any of the level 1 variables, making these interaction terms unnecessary.
Predictor variables were participant-mean centered. To do so the mean predictor value over all experimental conditions was calculated for every participant. This participant-specific mean was then subtracted from this participant's observed values during all experimental conditions. By centering, the intercept of each individual participant can be interpreted as the expected value of the outcome when the predictor values equal their own mean score.
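Participant-mean centering as described above can be sketched in a few lines. The data layout (a list of participant/value pairs) and the function name are assumptions made for illustration only.

```python
from collections import defaultdict

def participant_mean_center(rows):
    """Subtract each participant's own mean from that participant's values.

    rows: list of (participant_id, value) pairs, one per experimental condition.
    Returns the centered values in the same order as the input.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pid, value in rows:
        sums[pid] += value
        counts[pid] += 1
    means = {pid: sums[pid] / counts[pid] for pid in sums}
    return [value - means[pid] for pid, value in rows]

# Made-up example with two hypothetical participants (not study data).
example_rows = [("p1", 2.0), ("p1", 4.0), ("p2", 10.0), ("p2", 14.0), ("p2", 12.0)]
centered = participant_mean_center(example_rows)
```

After centering, every participant's predictor values average to zero, so between-subject differences in absolute level no longer contribute to the estimated slope.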
For all outcome variables the total variance across conditions and participants was calculated, as well as two intra-class correlations (ICC) representing the amount of total variance that could be explained by inter-individual differences, and the amount of variance that could be explained by the experimental manipulations.
Our main validation analyses revolve around the prediction of a criterion outcome by the wrist-based EDA measures. In these analyses we are primarily interested in the proportion of variance in our outcome variable explained by the predictor variable. Because no standard solution is available for calculating this explained variance in a full ML model, we applied two strategies. First, the explained variance of the outcome by the predictor was calculated by the formula: In which β 1j is the estimated slope of the full ML model, ε ij (predictor) the residual variance of a ML model using a random intercept only, and ε ij (model) the residual variance of the full ML model. This formula is an adjustment of the standard coefficient of determination, the proportion of the variance in the dependent variable that is predicted by the independent variable, used in linear regression. Secondly, we calculated the more intuitive within-subject correlation between the outcome and predictor for each individual participant separately and report the mean correlation and interquartile range, as well as the squared mean correlation as an approximation of the average proportion of variance in our outcomes that could be explained by the predictors across all participants.
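The second strategy (per-participant correlations summarized by their mean, interquartile range and squared mean) can be sketched as follows. This is a minimal illustration, not the authors' code: the data layout and function names are hypothetical, participants whose predictor or outcome is constant are skipped because their correlation is undefined (an assumption, not a rule stated in the text), and the quartiles use the default method of Python's `statistics.quantiles`.

```python
from math import sqrt
from statistics import mean, quantiles

def pearson(x, y):
    """Pearson correlation; returns None when either series has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)

def within_subject_summary(data):
    """data: dict mapping participant id -> (predictor values, outcome values)."""
    rs = []
    for predictor, outcome in data.values():
        r = pearson(predictor, outcome)
        if r is not None:
            rs.append(r)
    mean_r = mean(rs)
    q1, _median, q3 = quantiles(rs, n=4)
    # The approximation squares the MEAN correlation, as described in the text.
    return {"mean_r": mean_r, "iqr": (q1, q3), "r2": mean_r ** 2}

# Made-up example with three hypothetical participants (not study data).
example_data = {
    "p1": ([1, 2, 3], [2, 4, 6]),
    "p2": ([1, 2, 3], [3, 2, 1]),
    "p3": ([1, 2, 3], [1, 3, 2]),
}
summary = within_subject_summary(example_data)
```

Note that squaring the mean correlation is not the same as averaging the squared correlations: positive and negative within-subject correlations cancel in the former but not in the latter.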
All analyses were performed in R version 3.5.2. All ML analyses were performed using the packages lme4 and lmerTest. Models were estimated under restricted maximum likelihood, with random intercepts and random slopes set as correlated and using the optimizer "nlminbwrap" to mitigate convergence problems. To test for autocorrelation effects the ML models were rerun with lme from the nlme package, setting correlation to corAR(). The results with and without specified autocorrelation were almost identical. Therefore the results presented in this paper are limited to models without autocorrelation. The threshold for significance was set to p = .001. All analyses are performed in R-studio version 3.6.1 and the multilevel analysis package "lmerTest".

Correspondence between palm-based and wrist-based EDA measures.
To test the correspondence between within-subject changes in classic VU-AMS palm-based EDA and the new DTI5 wrist-based EDA, ML regression analyses including all 26 experimental manipulations were performed for our EDA measures SCL, ns.SCR_mat and ns.SCR_cf. The VU-AMS EDA measures were added as outcome variables and the DTI5 measures as the predictor variables.

Construct validity of palm-based and wrist-based EDA measures.
In testing the effects of our experimental manipulations on SNS activity we focus on the classical reactivity contrasts of 'stress level compared to baseline level'. Because the classical repeated measures ANOVA excludes participants who have missing values for one or more conditions, we instead used an ML approach that is robust to missingness and uses all observed data. For the four EDA measures and the PEP we performed an ML regression in which the experimental condition was used as a categorical predictor variable. Contrasts were specified that compared each condition to the appropriate sitting or standing baseline, by using that baseline as the first level of the experimental condition variable. In the model output, the estimated intercept represents the mean baseline, and the ensuing estimates for all the remaining conditions represent the deviations of these conditions from the baseline, with the p-value specifying whether the difference is significant.
Because posture itself has an effect on SNS activity, two separate analyses were performed: one for the mental stress tasks, which were all performed while sitting and therefore have the sitting quietly posture condition as baseline, with ten contrasts (four first-exposure mental stressors, two repeated mental stressors and four recoveries), and one for the physical stress tasks, which were all performed upright and therefore have the standing quietly posture condition as baseline, with eleven contrasts (six daily life activities, one recovery and four levels of physical stress on a treadmill).
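The logic of using the baseline as the reference level can be illustrated with plain condition means. This sketch deliberately leaves out the random intercepts and slopes of the ML models, so it shows only how treatment (dummy) coding turns the intercept into the baseline mean and the other estimates into deviations from it. The condition labels, function name and values are hypothetical.

```python
from statistics import mean

def baseline_contrasts(observations, baseline):
    """Return (baseline mean, {condition: mean difference from baseline}).

    observations: dict mapping condition label -> list of observed values.
    With a single categorical predictor and the baseline as reference level,
    ordinary least squares yields exactly these quantities as intercept and
    condition estimates.
    """
    intercept = mean(observations[baseline])
    deviations = {
        condition: mean(values) - intercept
        for condition, values in observations.items()
        if condition != baseline
    }
    return intercept, deviations

# Made-up example (not study data): ns.SCR frequency in peaks per minute.
example_obs = {
    "sitting": [2.0, 4.0],
    "stressor": [5.0, 7.0],
    "recovery": [3.0, 4.0],
}
intercept, deviations = baseline_contrasts(example_obs, baseline="sitting")
```

In this toy example a positive deviation for the stressor and a smaller positive deviation for the recovery would correspond to reactivity followed by incomplete recovery.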

Criterion validity of palm-based and wrist-based EDA measures.
In order to test whether changes in the skin-based measures of SNS activity show the same pattern as cardiac measures of SNS activity, we performed ML regression analyses using the cardiac measure PEP as the outcome variable and the EDA measures of both the VU-AMS and DTI5 as the explanatory variables. We excluded the lying down condition because PEP is known to be sensitive to the large preload effects in this posture (Houtveen et al., 2005), leaving 25 conditions for the PEP-EDA comparisons. Earlier work has shown that PEP was significantly correlated with the EDA measures SCL and ns.SCR frequency, particularly when both showed large variation due to the inclusion of physical stressors (Goedhart et al., 2008). Because the current study used multiple mental and physical stressors, a sensitivity analysis was performed by repeating the analyses separately for the mental stressors, including the four first-exposure mental stressors, two repeated mental stressors and four recoveries, and the physical stressors, including the six daily life activities, one recovery and four levels of physical stress on a treadmill.

Predictive validity.
To test whether changes in EDA measures could predict concurrent changes in affect induced by our experimental manipulations we performed ML regression analyses using positive and negative affect as the outcome variables and the five EDA measures (palm SCL, wrist SCL, palm ns.SCR, wrist ns.SCR_mat, and wrist ns.SCR_cf) as the predictor variables. Due to the potential effects of the long physical activity session on affect, mixed with potential effects of task habituation, we limited the scope of our analysis to the baseline mood report and the 4 reports taken after the first exposure to the mental stress tasks, which all took place before any of the physical stressors. For comparison the predictive validity of the PEP is also given.

Power calculation
The a priori power calculations for ML models necessarily involve many choices concerning peripheral (nuisance) parameters (variance of the intercept, autocorrelation of predictor X, autocorrelation of residual Y, amount of missing data), and concerning the focal parameters of interest (mean and variance of the slope). The most critical parameter is the within-person correlation, which we conservatively set to 0.1 for the fluctuations in EDA and fluctuations in 3MQ mood, i.e. an R² of 1%. All other correlations, i.e. within the physiological domain across different instruments measuring the same variable (e.g. wrist-based or palm-based EDA) and across different variables measuring the same construct (e.g. EDA and PEP), are expected to be (much) higher. As shown in the Appendix, in a sample of N = 120 individuals the power to detect 1% explained variance in an outcome by a predictor using T ≈ 26 repeated measures at a p-value of p = .01 was 0.975, even when we allow for differences in the slope across persons and a missingness of up to 40%.
Four participants did not have VU-AMS recordings for EDA or PEP because of a failure of the memory card. Six participants had a too low ICG quality to be included in the criterion validity analyses. Eight participants did not have DTI5 wrist-based data for the following reasons: 1) DTI5 battery was insufficiently charged (N = 2); 2) DTI5 removed because of participant discomfort (N = 1); 3) DTI5 recording error (N = 5). Nine of the above participants overlapped, in that they suffered from multiple sources of data loss (e.g. low ICG quality and DTI5 removed). This resulted in a population of 112 participants that could be included in the analyses (57% females, age range = 18-32, mean age = 22.3, SD = 3.4).

Data quality
On average the length of the active experimental conditions added together was 76 min. This excluded down-time between conditions. Because some conditions were not performed or were shortened for certain individuals (e.g. speed of the treadmill was too high for their fitness level), gross recording lengths ranged from 67 min to 79 min. Data quality of both devices was good. On average 86.5% of the recorded palm EDA signal was artefact free, while 88.8% of the recorded wrist EDA signal was considered artefact free. When assessing the occurrence of low absolute levels of skin conductance, we observed that in 14.9% of the participants the average wrist SCL was below 0.5 μS. This is considerably less problematic than suggested in a previous study that found that 73% of all wrist EDA data was below 0.5 μS (Milstein and Gordon, 2020). Structurally low SCL did not occur on the palm.
For the wrist, 23.1% of the participants had an absence of ns.SCR_mat in more than half of the experimental conditions (≥13); this was only 9.91% for ns.SCR_cf. This is more than at the palm, where all participants had at least one ns.SCR in the majority of experimental conditions; only 1.6% of the participants had an absence of ns.SCRs in less than 3 out of 26 conditions. In 28.44% of all observations the wrist did not detect a single ns.SCR_mat, while there was at least one ns.SCR detected at the palm. For ns.SCR_cf this was 23.10%. The better performance of the curve fit method (ns.SCR_cf) with respect to peak detection can be explained by the optimization of the curve fit method for the detection of peaks at the wrist specifically, taking into account the lower absolute level of conductance and the morphology of motion artefacts at that location. Fig. 3 shows that most of the experimental conditions in which peaks were detected only at the palm and not at the wrist (using the Matlab method; the results for the curve fit method were highly similar and are shown in Supplementary Fig. 1) are conditions with no or low physical activity. We believe that this effect is driven by the build-up of moisture seen at the wrist in the more physically demanding conditions. During physically non-engaging activities there is usually low sweat production at the wrist. When sweat levels are low, no moisture build-up has taken place between the sensors and the skin. This makes it very difficult for the sensors to detect the ns.SCRs despite them being present. This dependency on sufficient amounts of sweat to detect EDA is a known disadvantage of dry electrodes. Fig. 4 shows an example of a wrist and a palm EDA signal for a single participant during three different conditions.
Fig. 3. Percentage of participants that had at least one ns.SCR_mat detected at wrist, palm, both, or neither, separately per experimental condition.
Fig. 4. Direct comparison of EDA on the palm and the wrist of a single participant during rest (sitting), a mental stressor (Tone Avoidance task) and a physical stressor (walking on a treadmill at a fast pace). The top panels show the EDA recording of the palm, the bottom panels the recording on the wrist. Above each plot the ns.SCR frequency in peaks per minute is reported. The Y-axis of the left panels applies also to the middle and right panels.

Correspondence
The ICC analysis showed that inter-individual differences explained 80.7% of the variance in palm SCL, while only 5.8% was explained by the experimental manipulations. For wrist SCL, 58.4% and 18.7% of the variance was explained by inter-individual differences and experimental manipulations, respectively.
For palm ns.SCR frequency, 15.0% of the variance was explained by inter-individual differences and 46.3% by the experimental manipulations. For wrist ns.SCR frequency, 16.1% of the variance in ns.SCR_mat and 3.9% of ns.SCR_cf was explained by inter-individual differences and 52.0% of variance in ns.SCR_mat and 63.9% in ns.SCR_cf by the experimental manipulations, respectively. In general, we find that interindividual differences explain the largest part of variation in SCL, whether at palm or wrist, whereas the experimental manipulations are the major source of variance for the ns.SCR frequency at both locations.
To test the correspondence between the EDA measures from different locations, we predicted the palm-based EDA measures by their wrist counterparts. There is a significant correlation between wrist and palm EDA (Table 2). Of the variance in palm SCL 13.5% could be explained by wrist SCL, while 20% of the variation in palm ns.SCR frequency could be explained by wrist ns.SCR_mat and 19% by ns.SCR_cf.

Construct validity
The results of the construct validity analyses are shown in Table 3. Clear confirmation of successful experimental manipulation of SNS activity comes from the changes in PEP across conditions. Both mental and physical stressors lead to a decrease in the PEP over the appropriate sitting and standing baselines. During recoveries, the PEP values bounce back to the baseline values, with the exception of standing recovery after stair climbing where an increase in SNS activity remains evident. All experimental effects on the palm EDA measures were in the expected direction. Both mental and physical stressors increased SCL and ns.SCR over the appropriate sitting and standing baselines. During recoveries, values decrease compared to the previous stress level, although they remain elevated compared to the baseline. When comparing the beginning and end of the experiment there is no sign of a clear drift in the palm SCL signal, which keeps a steady average of around 13 μS with reliable increases during experimental manipulations.
Note to Table 2: β0j and β1j are the average intercept and slope of the regression of wrist-based on palm-based measures. SE and p values of the ML model with random intercept and slope show that individual variation around both (U0j) and (U1j) is significant. Model R² is the percentage of within-subject variance explained in palm EDA by wrist EDA. This value is also approximated by the correlation R², which derives from squaring the mean (M) within-subject correlations, the range of which is indicated by the IQR. Significant results are depicted in bold.
Legend to Table 3:
▴ significant increase in SNS compared to posture-specific baseline.
▾ significant decrease in SNS compared to posture-specific baseline.
X no significant change compared to posture-specific baseline.
= no significant change from stress level.
+ unexpected significant increase in SNS compared to stress level.
− expected significant decrease in SNS compared to stress level, but incomplete recovery compared to baseline.
-- expected significant decrease in SNS compared to stress level and complete recovery compared to baseline.
Note: for PEP higher levels indicate less SNS activity, so levels that are lower compared to baseline are marked as an increase in SNS activity.
Results for wrist SCL do not show the expected pattern, with SCL showing (non-significant) decreases rather than increases during exposure to the mental stressors, and SCL even decreased in response to some of the physical stressors, indicating a clear drift in the signal over time. Furthermore, absolute SCL levels on the wrist were considerably lower compared to the palm. In contrast, results for wrist ns.SCR are again very consistent with the expectations in both mental and physical stressors for both methods: generally ns.SCR frequency increases over the appropriate sitting and standing baselines, and during recoveries the ns.SCR frequency is seen to drop compared to the previous stress level. This indicates that ns.SCR on the wrist can track SNS activity independent of thermoregulatory need. The exception was the cool-down phase of the treadmill protocol, where SNS activity should have abated but ns.SCR frequency was seen to remain high at both palm and wrist. This exception is likely to reflect ongoing thermoregulatory sweating to restore core body temperature, which is known to be prolonged after exercise cessation (Kenny and McGinn, 2017). Fig. 5 presents a direct visual comparison between the mean ns.SCRs of the palm and wrist over the course of the experiment.

Criterion validity
The ICC showed that 32.1% of the variance in the PEP could be explained by inter-individual differences and 42.9% by the experimental manipulations.
To test whether electrodermal measures of SNS activity show the same pattern as cardiac measures of SNS activity, we compared the EDA measures of the palm and wrist to the PEP. As expected, all EDA measures showed a significant negative relationship with the PEP (Table 4), with a lower PEP being associated with higher EDA. When comparing the available EDA measures, for both palm and wrist EDA variability in PEP was best explained by ns.SCR frequency, in which palm ns.SCR frequency explained 21% of the variability in PEP and wrist ns.SCR_mat frequency explained 14.5% and ns.SCR_cf explained 25.5% of the variability. SCL correlated more poorly to PEP, particularly SCL at the wrist. The sensitivity analyses in Supplementary Table 1 showed that overall these relationships are attenuated but still present when computed across the mental and physical stressors separately. Fig. 6 shows that all first exposures to the mental stressors significantly increased negative affect and decreased positive affect. We found that 43.0% of the variance in positive affect could be explained by interindividual differences and 15.3% by the experimental manipulations, while 68.4% of the variance in negative affect could be explained by inter-individual differences and 6.8% by the experimental manipulations.

Predictive validity
As shown in Table 5, positive and negative affect were significantly related to palm SCL, palm ns.SCR frequency, and wrist ns.SCR frequency (with an exception for ns.SCR_mat, which only showed a trend for negative affect), but not to wrist SCL.
The relationships between EDA and affect were in the expected direction, with a higher SCL and ns.SCR being associated with lower positive affect and higher negative affect (Cacioppo et al., 1993; Nikula, 1991). However, the explained variance was very low, in part because the variation in this Likert-type scale was very low, and R² of the ML model converged to zero (not shown). When we approximate explained variance by the squared within-subject correlations, only palm SCL and ns.SCR frequency explained a meaningful part of the variance in affect (5.5%-13.5%).

Discussion
In an extensive controlled laboratory study in 112 participants we recorded two measures of skin SNS activity, SCL and ns.SCR frequency, using both wrist-based dry electrodes and classical palm-based wet electrodes. Throughout we find that the variance in absolute SCL at both palm and wrist is mainly determined by between-subject differences but only weakly by experimental manipulations, even if these induced a large range of SNS activity. In contrast, variance in ns.SCR frequency at the palm and the wrist was predominantly governed by experimental conditions. The ns.SCR frequency therefore seems a better measure than SCL to detect within-subject changes in SNS activity across conditions. In addition, whereas both SCL and ns.SCR frequency may capture relevant individual-specific factors like chronic stress or personality traits, SCL may also reflect anatomical features like the number of sweat glands per mm² skin, exact electrode positioning, hydration status at the time of recording, and other factors that are usually not germane to psychophysiological research. Those factors seem to plague the nonspecific SCRs to a lesser degree. The analysis of the correspondence between palm and wrist measures also favored ns.SCR frequency, as the explained variance in palm EDA by wrist EDA was larger for ns.SCR than for SCL (20% vs. 14%), although both were modest, in keeping with findings for SCL from previous studies (Milstein and Gordon, 2020; Menghini et al., 2019). The preferred use of ns.SCR frequency over SCL is further supported by a comparison of the performance of ns.SCR vs. SCL in the validity tests. Construct validity was higher for ns.SCR frequency than for SCL, most notably at the wrist, and the criterion validity, using the PEP as the criterion for SNS activation, was also better for ns.SCR frequency at both palm and wrist.
Furthermore, we found stronger predictive validity for changes in positive and negative affect using ns.SCR frequency than using SCL, and for the wrist only ns.SCR frequency was a significant affect predictor, although the explained variance was low for all signals (<14%). Finally, for wrist EDA the dependency of absolute SCL on thermoregulation was observed to a much lesser extent for the wrist-based ns.SCR frequency. Although thermoregulation remains a powerful co-determinant, our results for wrist ns.SCR frequency bolster the accumulating evidence (Machado-Moreira and Taylor, 2012; van Dooren et al., 2012) that refutes the traditional idea that detection of emotional sweat gland responding is confined to the palmar and plantar skin surface whereas only thermal sweating evokes responses of the sweat glands across other parts of the body (Dawson et al., 2000; Edelberg, 1967; Ogawa, 1975). Evidence of mental stress effects on the sweat glands at the wrist can, however, be detected only by ns.SCR frequency, and is indeed not seen in the SCL.
Fig. 6. Experimental manipulation of affect (panels: positive affect, negative affect). All first exposure stress tasks significantly decreased positive affect (p < .001) and increased negative affect (p < .001). Dots represent the mean; error bars represent the standard error of the mean.
Note to Table 5: β0j and β1j are the average intercept and slope of the regression of the EDA measures and affect. SE and p values of the ML model with random intercept and slope show that individual variation around both (U0j) and (U1j) is significant. Because the variance in the Likert-scale based affect outcomes was low, Model R² did not produce interpretable values, so we just report the approximation of the percentage of within-subject variance in affect explained by the EDA measures by the Correlation R². This R² derives from squaring the mean (M) within-subject correlations, the range of which is indicated by the IQR. Significant results are depicted in bold.
Taken together, our results lead us to conclude that ns.SCR frequency is a more suitable measure than SCL for prolonged ambulatory recording of SNS activity. We now turn to the question of whether the less invasive wrist-based recordings of ns.SCR can sufficiently capture SNS activity or whether the more obtrusive palm-based recordings are needed. In wet electrodes, the use of electrolyte cream on the palm increases conductance, while the dry electrodes are dependent on the presence of sweat to act as an electrolyte between the electrodes and the skin. This leads to much lower levels of absolute SCL, which in turn could make it more difficult to detect ns.SCRs. Indeed, in around 15% of the participants the average SCL on the wrist was below 0.5 μS, while this low level did not occur on the palm. While substantial, this percentage is considerably less problematic than suggested in a previous study that found that 73% of all wrist EDA data was below 0.5 μS (Milstein and Gordon, 2020). Even so, we did find that the absolute number of ns.SCR peaks detected was considerably higher at the palm than at the wrist. The issue of a lower number of EDA responses at non-palmar sites is a longstanding one (Rickles Jr and Day, 1968) and was investigated by Payne et al. (2016).
In two small samples of students they noted that only 16% to 31% of the SCRs to emotionally salient pictures at the palm were simultaneously detected at the wrist, i.e. 69% to 84% of the orienting response induced palm SCRs were not detected at the wrist (Payne et al., 2016). This is likely related to the lower amount of eccrine sweat glands on the wrist (Harker, 2013) but also the use of wet vs. dry electrodes.
In spite of the lower absolute number of ns.SCRs, good construct and criterion validity for wrist-based ns.SCR was found. Mental and physical stressors, followed by recovery periods, induce the expected changes in ns.SCR, and stressor-induced decreases in PEP are significantly associated with increases in the ns.SCR frequency for wrist-based measurements. About 14.5-25.5% of the variance in PEP was captured by the wrist-based ns.SCR measures. Also for palm-based ns.SCR this value is low at 21%. This may appear modest if we presume the SNS to always act as a completely unitary system with tightly parallel changes in outflow to all organs at once. However, such a unitary SNS response is unlikely to occur. Direct sympathetic nerve activity recordings and noradrenaline spillover studies have shown substantial regional specificity of SNS activity (Wallin, 2004) that would allow the SNS activity to skin and heart to be less than perfectly correlated. Moreover, various other factors will act to reduce the PEP-EDA correlation even if SNS activity to all organs were perfectly aligned: PEP is sensitive to preload and afterload effects (Lewis et al., 1977) and the indirect action of circulating levels of catecholamines on the ventricular β1 and β2 receptors. With these caveats on 'unitary SNS activity to all organs' in mind, the correlation found here between the PEP and the wrist- and palm-based ns.SCR measure is in the expected range.
A previous study had suggested that the ns.SCR-PEP relationship might be seen only when using a wide range of SNS activity, i.e. by adding intense physical exercise (Goedhart et al., 2008). For stress researchers, the more relevant question is whether changes in skin and cardiac SNS activity are also correlated during exposure to mental stressors. We therefore repeated the analysis separately for the mental stress and recovery periods (all sitting), excluding all physically active conditions. This analysis showed that changes in ns.SCR were still related to changes in the PEP, although the effect was strongly attenuated.
Regarding predictive validity, we found significant within-subject correlations between changes in affect induced by the mental stressors and changes in both palm and wrist ns.SCR frequency. Consistent with the literature, higher ns.SCR was associated with decreased positive affect and increased negative affect (Cacioppo et al., 1993; Nikula, 1991). The mean correlation for wrist-based ns.SCR frequency with affect was very modest, at −0.13 for positive affect and 0.16 for negative affect. Such low correlations are in keeping with a long history of modest relationships being reported between affective states and physiology (Cacioppo et al., 1993). This may in part reflect the restricted range of variance in affect. Although our mental stressors significantly lowered positive affect and increased negative affect, on average these changes amounted to about 1 point on a 7-point Likert scale, suggesting that the induced stress was relatively mild, which was further corroborated by the average PEP reactivity of −6.5 ms across all mental stress tasks. This is a limitation of artificial laboratory stressors in general; they cannot fully reconstitute the more profound stress experienced in real life daily situations.
Taken together, the results of our validity analyses suggest that wrist-based ns.SCR frequency is a useful addition to the ambulatory psychophysiologist's toolkit. It responds to our experimental manipulations of SNS activity and shows a decent overlap with cardiac SNS effects recorded in parallel, as measured by the PEP. Performance of wrist-based ns.SCR was in many aspects comparable to the more obtrusive palm-based ns.SCR, but the latter shows higher absolute levels of ns.SCR frequencies and remains superior in predictive validity for changes in affect. In controlled laboratory studies, palm-based EDA recording therefore remains the preferred method. The inherent limitations of wrist-based EDA recording should, however, be properly weighed against its huge advantages. Wrist-based ns.SCR frequency detection could feasibly be scaled up to epidemiology-sized studies including thousands of participants. In addition, prolonged recording for days to weeks and even months is possible, allowing the monitoring of daily SNS activity in relation to sleep and sleep quality (Sano and Picard, 2011; Sano et al., 2014), academic performance (Zhang et al., 2018), weekly fluctuations in work stress exposure, and longer term mood regulation in naturalistic social settings (Sano et al., 2018; Weise et al., 2013). In contrast to all other known 'pure' SNS measures, wrist-based EDA has the potential to be employed as a biofeedback tool for just-in-time adaptive interventions (Heron et al., 2017). Such interventions use early signs of stress from the physiological state of a client to time the provision of (smartphone-based) alerting and/or coaching to prevent the client from cascading into a chronic stress response. We therefore see our study as a strong justification for the further technical and methodological development of wrist-watch based recording of skin conductance as a way to measure SNS responses to perturbations by mental, emotional and physical stressors.

Declaration of competing interest
The authors declare no conflicts of interest beyond their affiliations.