On the relationship of arousal and attentional distraction by emotional novel sounds

Unexpected and task-irrelevant sounds can impair performance in a task. It has been shown that highly arousing emotional distractor sounds impaired performance less compared to moderately arousing neutral distractor sounds. The present study tests whether these differential emotion-related distraction effects are directly related to an enhancement of arousal evoked by processing of emotional distractor sounds. We disentangled costs of orienting of attention and benefits of increased arousal levels during the presentation of highly arousing emotional and moderately arousing neutral novel sounds that were embedded in a sequence of repeated standard sounds. We used sound-related pupil dilation responses as a marker of arousal and RTs as a marker of distraction in a visual categorization task in 57 healthy young adults. Multilevel analyses revealed increased RT and increased pupil dilation in response to novel vs. standard sounds. Emotional novel sounds reduced distraction effects on the behavioral level and increased pupil dilation responses compared to neutral novel sounds. Bayes Factors revealed strong evidence against an inverse proportional relationship between behavioral distraction effects and sound-related pupil dilation responses for emotional sounds. Given that the activity of the locus coeruleus has been linked to both changes in pupil diameter and arousal, it may embody an indirect relationship as a common antecedent by the release of norepinephrine into brain networks involved in attention control and control of the pupil. The present study provides new insights into the relation of changes in arousal and attentional distraction during the processing of emotional task-irrelevant novel sounds.


Introduction
A sudden cry can capture our attention, impair performance in a task at hand and increase the level of arousal to prepare us for a fight or flight reaction. The orienting of attention and the increase in arousal are two aspects of the orienting response, reflecting costs of attention distraction and benefits of arousal increase (Näätänen, 1992;Sokolov, 1963). The present study investigates the direct relation of distraction costs and arousal benefits by the co-registration of performance and pupil size in a well-established distraction paradigm.
The involuntary capture of attention by unexpected stimuli occurring outside the current focus of attention enables the detection of potentially relevant events in the environment (e.g., a ringing smartphone). The involved orienting and evaluation mechanisms include capacity-limited processes (Näätänen, 1992) that can result in impaired performance in a task (distraction effect, e.g., Escera, Alho, Winkler, & Näätänen, 1998; Schröger & Wolff, 1998). The underlying mechanisms have been described by a three-stage model of involuntary attention (e.g., Escera & Corral, 2007; Schröger, 1997). In the first stage, a predictive model of the acoustic environment is created automatically. Unexpected sounds (e.g., new sounds) violate the prediction (Winkler, Denham, & Nelken, 2009; Winkler & Schröger, 2015; but see May & Tiitinen, 2010). This can trigger an orienting of attention and further evaluation of the unexpected sound. If no adaptation of behavior is required, attention is reoriented to the task at hand. When applying an oddball paradigm including frequently repeated standard sounds and rare, randomly presented distractor or oddball sounds (also termed novel or deviant sounds), distractor sounds frequently cause prolonged reaction times (RTs) in a task not related to the sound sequence or to the deviant feature (auditory task, Berti, Roeber, & Schröger, 2004; Horváth, Winkler, & Bendixen, 2008; Muller-Gass & Schröger, 2007; visual task, Escera et al., 1998; Parmentier, Elford, Escera, Andrés, & San Miguel, 2008).
Nonetheless, some studies reported reduced distraction effects or even improved performance when task-irrelevant emotional information was presented (Lindström & Bohlin, 2011; Lorenzino & Caudek, 2015; Max, Widmann, Kotz, Schröger, & Wetzel, 2015). These reduced distraction or facilitation effects have been attributed to an increased level of arousal evoked by the emotional content of the distractor event (Max et al., 2015). This explanation is in line with the models by Näätänen (1992) and Sokolov (1963), which postulate that the orienting response includes costs of orienting and benefits of enhanced arousal (see also Hoyer, Elshafei, Hemmerlin, Bouet, & Bidet-Caulet, 2021). However, a direct relation between distraction effects and distractor-related changes in the arousal level has not yet been demonstrated in the context of emotional and novel sounds and is investigated in the present study. Previous studies in the visual domain have already investigated this relationship by means of emotional attentional blink tasks (McHugo, Olatunji, & Zald, 2013) and demonstrated that attention capture by distractors is modulated by state levels of arousal; in particular, highly emotional stimuli (e.g., threat of shock) facilitated task performance (Kim & Anderson, 2020; Kim, Lee, & Anderson, 2021; Lee, Itti, & Mather, 2012; Sutherland & Mather, 2015).
We used changes in pupil diameter as a marker of the increase in arousal related to novel and emotional sounds. Several studies confirmed that novel oddball sounds (Bonmassar, Widmann, & Wetzel, 2020; Wetzel, Buttelmann, Schieler, & Widmann, 2016; Widmann, Schröger, & Wetzel, 2018) and emotional events (Bradley, Miccoli, Escrig, & Lang, 2008; Bradley, Sapigao, & Lang, 2017; Hess & Polt, 1960) cause a transient dilation of the pupil. In recent oddball studies, emotionally negative novel sounds that were interspersed in a sequence of repeated standard sounds evoked stronger pupil dilation than emotionally neutral novel sounds (Bonmassar et al., 2020; Widmann et al., 2018). Additional evidence that emotional arousal is linked to an increase in pupil dilation comes from the simultaneous recording of skin conductance responses and heart rate as markers of emotional arousal. Recent studies observed a simultaneous increase in skin conductance responses and heart rate as well as pupil dilation responses to emotional stimuli (Bradley et al., 2008, 2017; Wang et al., 2018).
In sum, cumulative evidence supports a model postulating that distraction effects include costs of orienting and benefits of arousal. To test this hypothesis, we applied a well-established auditory-visual oddball paradigm, including emotional highly arousing and neutral moderately arousing environmental novel sounds, while participants focused on a visual categorization task. First, we expected task-irrelevant novel sounds to prolong RTs compared to standard sounds (distraction effect; Escera, 1998; Schröger & Wolff, 1998) and to increase pupil dilation responses (PDRs; Murphy, Robertson, Balsters, & O'Connell, 2011; Widmann et al., 2018). Second, we expected reduced distraction effects in response to emotional novel sounds compared to neutral novel sounds (Max et al., 2015) but increased amplitudes of the PDRs to emotional vs. neutral novel sounds (Bonmassar et al., 2020). Third, we hypothesized a direct relationship between emotion-related distraction effects and emotion-related PDRs. That is, we expected concomitant shorter RTs and higher PDRs in trials with emotional but not with neutral novel sounds. Importantly, the relationship between RT and PDR needs to be analyzed at both the within- and the between-participant level to disentangle effects of the average PDR (e.g., participants with higher average PDR show smaller distraction effects) from effects at the single-trial level (e.g., smaller distraction effects occur in trials with larger PDR). This was achieved by means of adequate centering strategies in linear mixed-effects models (for an example of different levels of analysis see LoTemplio, Silcox, Federmeier, & Payne, 2021). Moreover, the present experimental approach includes ethically harmless stimuli and has no potential to cause anxiety, fear or threat. Thus, this paradigm can be applied in developmental studies even with young children as well as with patients, enabling insights into the relation between attentional distraction and arousal in developmental and clinical samples.

Participants
Sixty-one participants took part in the experiment. Four participants were excluded for the following reasons: pupil data available from one eye only (one participant), reaction times deviating more than two standard deviations from the average (two participants), and an accidental double participation in the experiment (one participant). The data of 57 healthy adults (M age = 25 years, range 18-36, 31 females, 5 left-handed) were used in the study. Participation was compensated with money (10€/hour). All participants gave written informed consent. Participants confirmed normal or corrected-to-normal vision, normal hearing, no medication with effects on the nervous system, and no history of attention-related disorders. Handedness was measured with an abbreviated German version of the Oldfield Handedness Inventory (Oldfield, 1971). The project was approved by the local ethics committee.

Auditory stimuli
A total of 48 environmental novel sounds¹ were collected from the database of a previous study (Max et al., 2015). Max and colleagues selected a set of 210 auditory stimuli, collected from the International Affective Digitized Sounds study (IADS, Bradley & Lang, 2007) and from other databases as described by Max et al. (2015).
In the present study, sounds were allocated to three categories: 24 highly arousing emotionally negative novel sounds (for example an ambulance siren), 24 moderately arousing neutral novel sounds (for example toasting glasses), and 3 moderately arousing neutral sounds used as standard sounds (for example a musical instrument). Descriptive statistics and independent-samples t-tests are reported in Table 1 and Table 2. Sounds had a duration of 500 ms including faded ends of 5 ms. They were presented at a loudness of 54.5 dB SPL (measured with PAA3 PHONIC Handheld audio analyzer, Phonic Corporation, Taipei, Taiwan). Loudness of sounds was equalized with root mean square normalization. Sounds had been rated on a 9-point scale for valence (1 = unpleasant, 5 = neutral, 9 = pleasant) and arousal (1 = calm, 9 = arousing).
¹ All sounds used in the present study were environmental sounds. In the following, we omit the specification "environmental" and refer to the sounds as "standard", "emotional novel" and "neutral novel" to improve readability.

Visual stimuli
Three different target categories were presented in separate blocks: (a) princesses vs. knights (Fig. 1), (b) cats vs. hens, (c) butterflies vs. fish. For each target figure, two versions were presented (slightly differing in shape, color and direction). All versions of the target figures were presented with equal probability (e.g., 25% princess with a pink dress, 25% princess with a blue dress, 25% knight with gray armor, 25% knight with blue armor) in a pseudorandomized order, that is, we implemented some constraints to avoid biases that could appear in a completely randomized order. For example, the first two trials of each block were standard trials, and each novel sound was followed by two standard trials. For each target category, a different scene was used as a background. Princesses and knights were presented in front of a palace (left side) and a fortress (right side), cats and hens were presented in front of a basket (left side) and a hen-roost (right side), and butterflies and fish were presented with a flowering shrub (left side) and a pond (right side). The background pictures were displayed at the center of the screen with a size of 960 × 720 px, 267 × 200 mm (24.3° × 18.3° visual angle at a viewing distance of 620 mm). Mean picture luminance without targets or feedback was 51.2 cd/m² on a gray background screen with a mean luminance of 2.9 cd/m² (princess/knight 50.3 cd/m², cat/hen 55.1 cd/m², butterfly/fish 48.2 cd/m²). The different versions of the targets and background scenes were used so that exactly the same paradigm can be applied to children in a future study (not reported here).

Apparatus and software
The auditory stimuli were presented via loudspeakers (Bose Companion 2 series III Multimedia speaker system) located to the left and right of the screen. The visual stimuli were presented on a VIEWPixx/EEG display (VPixx Technologies Inc.) with a resolution of 1920 × 1080 (23.6-in. diagonal display size) and a refresh rate of 120 Hz. Responses to the target were given by pressing a button on a response box (RTbox) located in front of the screen (Li, Liang, Kleiner, & Lu, 2010). The experimental stimulation was presented via Psychtoolbox (Version 3.0.15, Kleiner, Brainard, & Pelli, 2007) using Octave (Linux, Version 4.0.0).

Statistically significant results are marked in bold.

Fig. 1.
Trial structure. In every trial, a sound was presented for 500 ms. 100 ms after sound offset, each sound was followed by the target (e.g., princess). The target was presented for 500 ms. Participants were instructed to press the left button when a princess appeared and the right button when a knight appeared. The response time window was 2000 ms after target onset. Correct responses within the response time window were directly followed by a feedback motion, which consisted of two images with a duration of 150 and 450 ms (total feedback duration 600 ms). The visual background was presented during the entire trial.

Procedure
The experiment was conducted in an acoustically attenuated and electromagnetically shielded cabin. Illuminance of the cabin was held constant at a level of 48.9 lx (measured with MAVOLUX 5032B USB, GOSSEN Foto- und Lichtmesstechnik GmbH, Nürnberg, Germany). Participants sat in front of a screen with their right and left index fingers on the RTbox buttons. Each experimental block started with a five-point eye-tracker calibration and validation procedure.

Task and feedback
Participants were instructed to press the left button when a princess (or cat or butterfly) appeared on the screen and the right button when a knight (or hen or fish) appeared (see Fig. 1). They were asked to respond to the target stimuli as fast and correctly as possible and to ignore the sounds. Correct responses were followed by feedback, that is, the target moved toward the left or right side. For example, the princess moved to the palace on the left side and the knight moved to the fortress on the right side (Fig. 1). The feedback motion consisted of two images with a duration of 150 and 450 ms.

Trial and block structure
Sounds were presented with a fixed stimulus onset asynchrony (SOA) of 3300 ms (Fig. 1). 100 ms after sound offset, each sound was followed by a visual target. The target was presented for 500 ms. After target onset, participants had a 2 s time window to respond. The feedback was presented for 600 ms directly after the response, but not earlier than 200 ms after target offset. A total of six blocks were presented, each consisting of 40 trials. Two blocks included princesses and knights as target figures, two blocks cats and hens, and two blocks butterflies and fish. The order of blocks containing different scenes was balanced across participants. Blocks containing the same scene were always presented one after another. Each block lasted about 2 min.

Sound sequence
The sound sequence included standard sounds (80%), emotional novel sounds (10%) and neutral novel sounds (10%). These probabilities were identical in each block. That is, each of the six blocks included 32 standard sounds, 4 emotional novel sounds and 4 neutral novel sounds. In total, 192 standard sounds, 24 emotional novel sounds and 24 neutral novel sounds were presented. The sound sequence was unique for each participant. This ensured that potential changes in brightness were not systematically related to the occurrence of different sound types. For each scene (princesses vs. knights, cats vs. hens, butterflies vs. fish) a different standard sound was presented. This prevented potential effects of specific stimulus features of a single standard sound on performance. The assignment of standard sounds to the scenes was counterbalanced across participants. The sound sequence was pseudo-randomized so that each novel sound was preceded by at least two standard sounds. Each novel sound was presented only once in total.
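The sequence constraints described above can be illustrated with a small sketch. It is not the authors' stimulation code (which ran in Psychtoolbox/Octave); the function name `make_block_sequence`, its parameters, and the use of rejection sampling are assumptions made for illustration only.

```python
import random

def make_block_sequence(n_trials=40, n_emotional=4, n_neutral=4, min_gap=2, rng=None):
    """Return one block of sound types (hypothetical helper, not the original code).

    Constraints taken from the text: the block starts with at least `min_gap`
    standards, and every novel is preceded by at least `min_gap` standards.
    """
    rng = rng or random.Random()
    n_novel = n_emotional + n_neutral
    # Rejection sampling: redraw the novel positions until all gaps are valid.
    while True:
        positions = sorted(rng.sample(range(min_gap, n_trials), n_novel))
        if all(b - a > min_gap for a, b in zip(positions, positions[1:])):
            break
    labels = ["emotional"] * n_emotional + ["neutral"] * n_neutral
    rng.shuffle(labels)
    sequence = ["standard"] * n_trials
    for pos, label in zip(positions, labels):
        sequence[pos] = label
    return sequence
```

Calling `make_block_sequence(rng=random.Random(1))` yields a reproducible block; a separate generator per participant would give each participant a unique sequence.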

Training blocks
To familiarize participants with each of the three different scenes in the experimental blocks, three short training blocks of 8 trials each (6 standard sounds, 1 emotional and 1 neutral novel sound) were performed. Sounds presented in the training blocks were not presented in the experimental blocks. If >50% of the trials were answered incorrectly, the training was repeated. Because the experiment was designed to be suitable for children, all participants of the present study understood the task promptly and no repetition was needed.

Data analysis
The first two standard trials per block are required for the formation of a predictive model of the upcoming stimuli (Bendixen, Roeber, & Schröger, 2007). Because the two standard trials immediately following an oddball sound can be affected by previous distractor sound processing (Wetzel, 2015), these were removed from all analyses. Only corresponding identical trials from the behavioral and pupil data, including a correct response, were used for analysis. Trials with incorrect or missing responses were excluded from pupil data analysis and trials with missing pupil data or blinks which could not be interpolated (see below) were also excluded from RT analysis.
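As a rough illustration of the position-based exclusions described above, the following sketch returns the indices of trials that remain in one block. The function name `analyzable_trials` and the label `"standard"` are hypothetical; response-based and pupil-based exclusions are not covered here.

```python
def analyzable_trials(sequence, n_skip=2):
    """Return indices of trials kept after position-based exclusions.

    Dropped: the first `n_skip` standards of the block and up to `n_skip`
    standards immediately following a novel sound. Novels are always kept.
    """
    keep = []
    last_novel = None
    for i, sound in enumerate(sequence):
        if sound != "standard":
            last_novel = i
            keep.append(i)
            continue
        at_block_start = i < n_skip
        after_novel = last_novel is not None and i - last_novel <= n_skip
        if not (at_block_start or after_novel):
            keep.append(i)
    return keep
```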

Pupil data processing
The pupil diameter of both eyes was recorded with an infrared EyeLink Portable Duo eye-tracker (SR Research Ltd., Mississauga, Ontario, Canada). The eye tracking was set up in remote mode at a sampling rate of 500 Hz.
The eye-tracker automatically reports the number of pixels below a specific threshold as belonging to the pupil. This value is reported as area or, when diameter is recorded (as here), after a transformation of area to diameter: $d = 256 \sqrt{A / \pi}$, where $A$ is the pupil area in pixels. By maintaining a constant distance between the participant and the eye-tracker, the number of pixels reflects a meaningful and valid physical unit that can be converted to other meaningful units by simple linear transformations (e.g., mm; as described in several publications, for example Hayes & Petrov, 2016; Klingner, Kumar, & Hanrahan, 2008). We converted the eye-tracker pupil diameter digital counts to mm as suggested by Steinhauer, Bradley, Siegle, Roecklein, and Dix (2022). Pupil size analysis was implemented in MATLAB. Saccade and blink information was provided by the eye-tracker. Partial blinks were detected during postprocessing from the smoothed velocity time series by an additional custom function, i.e., pupil diameter changes exceeding 20 mm/s, including a 50 ms pre-blink and a 100 ms post-blink interval (Merritt, Keegan, & Mercer, 1994). We applied Kret and Sjak-Shie's (2019) dynamic offset algorithm to average data from both eyes. Isolated data segments shorter than 10 ms between blinks or missing data were treated as missing data. Subsequently, segments with blinks or missing data shorter than 1 s were linearly interpolated; longer segments were removed from the continuous data. Data were segmented into epochs of 2 s duration (including a −0.2 to 0 s prestimulus interval) and baseline corrected by subtracting the mean amplitude of the baseline period (−0.2 to 0.2 s) from each epoch. Typically, the pupil does not contract or dilate earlier than 200 ms after stimulus onset (Mathôt, Fabius, Van Heusden, & Van der Stigchel, 2018). Thus, the baseline period was extended to range from −0.2 to 0.2 s, which allows for a wider span of baseline activity.
The mean PDRs were computed in a time window around the peak between 1.3 and 1.5 s for each trial and each participant. In addition, for each trial, the average pupil size in the baseline period was computed.
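The interpolation, baseline correction and windowed averaging steps can be sketched as follows. This is a simplified illustration in plain Python, not the MATLAB pipeline used in the study; the function names, the use of `None` for missing samples, and the millisecond time axis are assumptions.

```python
def interpolate_gaps(samples, max_gap):
    """Linearly interpolate runs of None shorter than `max_gap` samples.

    Gaps touching the edges, or gaps of `max_gap` samples or more, stay None.
    """
    out = list(samples)
    i, n = 0, len(out)
    while i < n:
        if out[i] is not None:
            i += 1
            continue
        start = i
        while i < n and out[i] is None:
            i += 1
        end = i  # index of the first valid sample after the gap (or n)
        if start > 0 and end < n and end - start < max_gap:
            left, right = out[start - 1], out[end]
            span = end - start + 1
            for k in range(start, end):
                out[k] = left + (right - left) * (k - start + 1) / span
    return out

def epoch_pdr(epoch, times_ms, baseline=(-200, 200), window=(1300, 1500)):
    """Return (mean baseline-corrected PDR in `window`, mean baseline size)."""
    def mean_in(lo, hi):
        vals = [v for v, t in zip(epoch, times_ms) if lo <= t <= hi and v is not None]
        return sum(vals) / len(vals)
    base = mean_in(*baseline)
    return mean_in(*window) - base, base
```

At the 500 Hz sampling rate reported above, "shorter than 1 s" would correspond to `max_gap=500` samples.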

Behavioral data (reaction times, RTs)
Incorrect responses, responses faster than 100 ms after target onset and missing responses (or responses given later than 2 s after target onset) were excluded from RT and pupil analysis. Participants deviating >2 standard deviations from the average reaction time were excluded from the analysis. On average, 132 trials per participant (SD = 5.19; max = 140; min = 109) were used for analysis.

Statistical analysis

Analysis of condition effects
Paired-samples t-tests and Bayesian paired-samples t-tests were performed to compare PDR amplitudes in response to standard sounds, emotional novel sounds and neutral novel sounds in the selected mean amplitude time window (1.3-1.5 s). The same analysis was performed for the RTs in trials including standard sounds, emotional novel sounds, and neutral novel sounds. All t-tests and Bayesian t-tests were performed using the R packages stats (v4.0.3, R Core Team, 2019) and BayesFactor (v0.9.12-4.2, R. D. Morey, Rouder, Pratte, & Speckman, 2011; Rouder, Speckman, Sun, Morey, & Iverson, 2009).

Analysis of statistical associations
We analyzed the relationship between RT and PDR at both trial and participant level with linear mixed-effects models (LMMs) to account for the dependencies between trials within participants. Trials were treated as the primary unit of investigation (level 1) nested within participants (level 2). All models were estimated with the Maximum Likelihood method using the R packages lme4 (v1.1-27, Bates, Mächler, Bolker, & Walker, 2015) and lmerTest (v3.1-3, Kuznetsova, Brockhoff, & Christensen, 2017). As measures of model goodness of fit, we computed marginal and conditional R² (Nakagawa & Schielzeth, 2013), that is, the proportion of the total variability explained by the fixed effects and by all fixed and random effects together, respectively. Please note that relatively low values for R² are not uncommon due to the considerable variability of RTs between trials. Degrees of freedom for statistical tests were approximated using Satterthwaite's approximation. Bayes Factors were approximated from differences in the Bayesian Information Criterion (BIC), that is, $BF \approx \exp(\Delta BIC / 2)$ (Raftery, 1995). Specifically, we followed the logic of a "Type III" analysis of variance and computed the Bayes Factors from the comparison of the full model versus the full model excluding the respective effect.
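The BIC approximation to the Bayes Factor (Raftery, 1995) can be sketched in a few lines. The function name and argument order are illustrative assumptions; the direction chosen here expresses evidence in favour of the model that includes the effect, so values below 1 count against the effect.

```python
import math

def bf_from_bic(bic_without, bic_with):
    """Approximate Bayes Factor for the model with the effect vs. without it.

    Uses BF ~= exp((BIC_without - BIC_with) / 2). A lower BIC for the model
    with the effect gives BF > 1; a higher BIC gives BF < 1.
    """
    return math.exp((bic_without - bic_with) / 2.0)
```

Under this approximation, a Bayes Factor of about 0.002 corresponds to the larger model having a BIC roughly 12.4 points higher than the smaller one.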
We explored a range of conceivable models in which RT was modeled as a function of the various candidate predictors. To systematize the search for the best model, we applied a best subset selection and selected the best-fitting model using the Bayesian information criterion (BIC; Burnham & Anderson, 2004; Schwarz, 1978; Table 3). The set of candidate predictors contained various predictors at trial and participant level. Following from the experimental design, we always included Novelty (Standard vs. Distractor) and Emotionality (Neutral novel sound vs. Emotional novel sound) of the presented sounds as predictors. We applied a contrast coding such that the coefficient of Novelty (0 for standards, 1 for novels irrespective of the emotional content) is an estimate of the predicted difference in RT between standards and neutral novels, whereas the coefficient of Emotionality (0 for standard sounds and neutral novels, 1 for emotional novels) reflects the predicted difference in RT between emotional and neutral novels. We included a random intercept (i.e., varying average RT) and a random slope for the predictor Novelty across participants (i.e., varying distraction effects). A random slope for Emotionality was not supported by the data and resulted in a singular fit.² We considered several potential relationships between pupil diameter and RTs: Both pupil diameter within the baseline period (baseline PD) and during the pupil dilation response (PDR) were used as potential predictors (LoTemplio et al., 2021; Murphy et al., 2011). The baseline PD was included in the selection process of the best model only to improve the model estimates (Alday, 2019) by controlling for potential confounding due to differences in baseline PD. This approach is comparable to an ANCOVA approach where covariates are included, although they are not of substantive interest, to control for confounding. We did not interpret the resulting baseline effects as this would have gone beyond the scope of the manuscript (however, for possible interpretation see Supplementary material).
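To make the contrast coding concrete, the following sketch shows how the two dummy variables combine into a predicted RT when only the fixed effects of Novelty and Emotionality are considered. The coefficient values used in the usage note are made up for illustration and are not estimates from the study.

```python
# Dummy codes as described in the text: (Novelty, Emotionality)
CODING = {
    "standard": (0, 0),
    "neutral_novel": (1, 0),
    "emotional_novel": (1, 1),
}

def predicted_rt(sound, intercept, b_novelty, b_emotionality):
    """Fixed-effects prediction using only the two design predictors."""
    novelty, emotionality = CODING[sound]
    return intercept + b_novelty * novelty + b_emotionality * emotionality
```

With hypothetical coefficients (intercept 500, Novelty 30, Emotionality −20), the neutral novel prediction exceeds the standard prediction by the Novelty coefficient, and the emotional novel prediction differs from the neutral novel prediction by the Emotionality coefficient.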
We considered that baseline PD and PDR can vary from trial to trial but there may also be systematic differences between participants and both these sources of variation could affect response times differentially. To give an intuition why this can happen: When the raw PDR takes a "large" value (e.g., relative to the grand mean) it remains unclear what "large" exactly implies, because large values could be due to the respective participant generally showing large PDRs or due to the specific trial showing a large PDR. If the raw baseline PD or PDR values were used as predictors, the trial and participant level effects of these predictors would be confounded and uninterpretable.
An established way to disentangle trial-level from participant-level variance is to create two variables: a trial-level variable that is centered around the mean within participants and a participant-level variable that represents the mean of each participant centered around the grand mean of all participants (Enders & Tofighi, 2007). The former variable represents the effect of fast fluctuations at trial level (e.g., do participants respond faster in trials with a larger PDR relative to the participant's individual average?). The latter variable represents the effect of interindividual differences that are stable over the course of the experimental session (e.g., do participants with a generally larger PDR have larger behavioral distraction effects?). Both baseline PD and PDR were treated this way to separate the two sources of variation. We refer to the trial-level variables by the index "trial" (e.g., PDR_trial) and to the participant-level variables by the index "participant" (e.g., PDR_participant). The least complex model under consideration contained Novelty, Emotionality, PDR_trial and Baseline_trial as simple effects. The most complex model could include Novelty, Emotionality, linear and quadratic effects of PDR and baseline PD at both trial level and participant³ level, as well as their interactions with Novelty and Emotionality. Between these models, all possible alternative models were considered in the model space with two restrictions: (1) Any model containing a quadratic effect or interaction should also include the respective lower-order ("simple") effect. (2) Any interaction including either Novelty or Emotionality should always be accompanied by an interaction with the respective other predictor because potential differences between the sound types are of genuine substantive interest to our study. All model effects specified based on the BIC selection are listed in Table 4.
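The centering strategy can be sketched as follows. The function name `split_levels` is hypothetical, and the grand mean is computed here as the mean of the participant means, which is one of two plausible readings of "grand mean of all participants".

```python
from collections import defaultdict

def split_levels(values, participants):
    """Split a predictor into trial-level and participant-level parts.

    trial level:        value minus the participant's own mean
    participant level:  participant mean minus the grand mean
    """
    by_participant = defaultdict(list)
    for v, p in zip(values, participants):
        by_participant[p].append(v)
    means = {p: sum(vs) / len(vs) for p, vs in by_participant.items()}
    # Assumption: grand mean taken over participant means, not over all trials.
    grand_mean = sum(means.values()) / len(means)
    trial_level = [v - means[p] for v, p in zip(values, participants)]
    participant_level = [means[p] - grand_mean for p in participants]
    return trial_level, participant_level
```

For two participants with PDRs of (1, 3) and (5, 7), the trial-level variable is (−1, 1, −1, 1) and the participant-level variable is (−2, −2, 2, 2), so a "large" raw value is decomposed into who produced it and when.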
Except for a moderate skewness (2.29) in the level-1 residuals due to very slow responses in some single trials, all model assumptions were met. We decided to keep these rare trials in the dataset because their removal would not have changed the results in any meaningful way given the large number of trials available for the model.⁴

The parameters or predictors that are included in the model are marked with an "X".

RT-PDR-model
The best-fitting BIC-selected model contained the predictors Novelty, Emotionality, Baseline_trial,⁵ Baseline_trial × Novelty, Baseline_trial × Emotionality*, (Baseline_trial)², PDR_trial and PDR_participant. The effects of the predictors Novelty and Emotionality showed that participants responded significantly slower to novel than to standard sounds but significantly faster to emotional novel sounds compared to neutral novel sounds (Fig. 3, Panel A; Table 4, Novelty and Emotionality effects), resembling the results of the confirmatory analyses above. In addition, we found dissociable relationships between PDR and RT at trial and participant level. At trial level, slower RTs were predicted for trials with larger PDR (Fig. 4, Panel A; Table 4, effect PDR_trial), but participants with generally larger PDRs tended to respond faster (Fig. 4, Panel B; Table 4, effect PDR_participant). The model also revealed effects of the baseline PD on RTs and distraction effects that are described in detail in the Supplementary material.
With respect to our research questions, the existence of an interaction of pupil dilation and behavioral distraction effects (either at trial or participant level) was of major interest. Therefore, we computed additional Bayes Factors comparing the BIC-selected model with models in which we added such interactions. At trial level, there was strong evidence against the inclusion of the terms PDR_trial × Novelty and PDR_trial × Emotionality in the BIC-selected model (BF = 0.002). At participant level, there was strong evidence against the inclusion of the terms PDR_participant × Novelty and PDR_participant × Emotionality (BF = 0.006). That is, the model did not support an interaction of these factors at either the trial level or the participant level.

Statistically significant results are marked in bold. RT = reaction time; SE = standard error; df = degrees of freedom; SD = standard deviation; Corr = correlation; PDR = pupil dilation response; BF = Bayes Factor.

⁴ We investigated the potential impact of the misspecified level-1 residual distribution by comparing our model with a normal level-1 distribution with a model with an exgaussian distribution, which can account for the considerable skewness of RTs at trial level, using brms (Bürkner, 2017, 2018; Carpenter et al., 2016), which utilizes a Bayesian estimation algorithm. The exgaussian model fit the data substantially better, but this did not affect any substantive conclusion, because none of the parameters changed its sign or effect size fundamentally.
⁵ These terms were added manually following the substantive restrictions outlined above. This change did not affect size or hypothesis test of any other effect in the model.

Fig. 2.
Grand-average pupil dilation responses (PDRs) for emotional novel sounds, neutral novel sounds, and standard sounds. Sound onset is at time point zero. Shading indicates the 95% confidence interval. The gray window indicates the time window used for analysis. Novel sounds evoked statistically significantly increased PDRs compared to standard sounds. Emotional novel sounds evoked statistically significantly increased PDRs compared to neutral novel sounds.

Fig. 3.
Panel A: Mean reaction time (RT) for standard, neutral novel and emotional novel sounds. Novel sounds evoked increased RTs compared to standard sounds, demonstrating a distraction effect. Emotional novel sounds caused reduced RTs compared to neutral novel sounds, indicating a facilitation effect. Panel B: Mean pupil dilation response for standard, neutral novel and emotional novel sounds. Novel sounds evoked larger PDRs compared to standard sounds. Emotional novel sounds evoked larger PDRs compared to neutral novel sounds, indicating an increase in arousal. Panel C: Mean distraction effects (RT novel minus RT standard sound) and pupil dilation differences between PDR to novel and standard sounds. This plot displays the hypothesized relationship between faster reaction times and larger pupil dilations to emotional novel sounds. This relation has been disconfirmed by the multilevel analysis. The plots show 95% confidence intervals.

Discussion
This study investigated the direct relation between emotion-related distraction effects on performance in a primary task and increased levels of arousal evoked by the processing of such emotional distractor sounds. Novel sounds, compared to standard sounds, prolonged RTs in a visual categorization task and evoked a transient dilation of the pupil. On the behavioral level, distraction effects were reduced in response to emotional compared to neutral novel sounds, while the pupil dilated even more in response to emotional novel sounds vs. neutral novel sounds. However, the mixed-effects models did not provide any evidence for a correlation between performance and transient changes in pupil diameter that was specific to a sound's novelty or emotional content. This result was confirmed by Bayes Factors.
Novel sounds impaired performance in a subsequent categorization task compared to standard sounds. This result is consistent with current models of distraction of attention (Corbetta & Shulman, 2002; Escera & Corral, 2007; Näätänen, 1992; Posner, 1980, 2016; Sokolov, 1963). New, salient, and task-irrelevant events can involuntarily capture attention and can impair performance. This distraction effect (difference between RTs to distractor and RTs to standard sounds) has been observed in the auditory, visual, and tactile modality (Akatsuka et al., 2007; Escera, 1998; Schröger & Wolff, 1998) and has been replicated many times (Berti & Schröger, 2001; Hoyer et al., 2021; Parmentier, 2014; Wetzel, Scharf, & Widmann, 2019).
Task-irrelevant emotional novel sounds significantly decreased distraction effects compared to neutral novel sounds. Our results indicate that task-irrelevant emotional stimuli facilitated processing and improved behavioral performance in a subsequent task (Lindström & Bohlin, 2011; Lorenzino & Caudek, 2015; Max et al., 2015; Phelps, Ling, & Carrasco, 2006; Zeelenberg & Bocanegra, 2010). Similar effects have been observed in the visual modality as well (e.g., Kim et al., 2021; Kim & Anderson, 2020; Lee et al., 2012; Sutherland & Mather, 2015).
It is worth mentioning that the pupil can also partly reflect the activity of structures other than the LC such as the superior colliculus (SC, for review see Wang & Munoz, 2015) and the anterior cingulate cortex (ACC). Activity in such brain areas is coordinated with pupil fluctuations that are associated with certain aspects of cognitive processing, including attention and orienting to salient stimuli (Joshi et al., 2016; Wang & Munoz, 2015). However, LC and SC might subserve complementary functions (for review see Einhäuser, 2017); for example, the SC is part of a coeruleo-cortical pathway that modulates attentional functions (Wang & Munoz, 2015). These multiple pathways may help to explain the more variable timing of pupil-related modulations of neuronal activity.
Even though emotional novel sounds evoked larger PDRs and reduced distraction effects separately, the applied multilevel model did not support a correlation between the two effects at either the trial or the participant level. The lack of a correlation was confirmed by the computation of Bayes Factors, which showed that the data provide strong evidence against such interactions. That is, emotional novel sounds evoking larger PDRs did not show systematically larger behavioral facilitation effects, and participants showing larger average PDRs in response to emotional distractor sounds did not show correspondingly larger behavioral facilitation effects. Based on these results, we suggest that the emotion-related facilitation effect on the behavioral level and the increase in arousal reflected by the PDR most likely do not reflect the operation of identical processes. They are presumably caused by at least partly independent mechanisms. This does not exclude common precursor processes. It can be speculated that one of the involved processes does not show proportional behavior, for example due to all-or-nothing effects or ceiling or floor effects. Taken together, we propose that our behavioral and psychophysiological results indicate the operation of possibly related, but not identical mechanisms contributing to emotion-related decreased effects of distraction.
Even though we did not find an emotion-specific correlation between reduced distraction effects and increased PDR, our exploratory analysis showed two opposite relationships between RTs and PDRs at the trial and the participant level, independent of the sound type presented. At the participant level, participants with larger mean PDR responded faster to target stimuli. Behaviorally relevant stimuli can dilate the pupil (e.g., Beatty, 1982; Murphy et al., 2014). The negative correlation could indicate that participants with increased PDR have continuously and more effectively used the sounds as a temporal cue for both the occurrence and the timing of the upcoming target (Hackley, 2009; Hackley & Valle-Inclán, 2003) and effectively prepared for the onset of the to-be-categorized stimulus (Volosin, Grimm, & Horváth, 2016; Wetzel, Schröger, & Widmann, 2013). This can result in faster responses compared to participants who were less engaged in the task. Strauch and colleagues (2022) suggested that pupil dilation might be interpreted as a readout of all three attentional subsystems proposed by Petersen and Posner (2012): alerting, orienting, and executive attention. Following this suggestion, the negative correlation at the participant level could also reflect higher-level attentional factors related to the executive functions: participants with larger PDR employed more attentional resources because they were more engaged in performing the task. In both cases, the negative correlation between RTs and PDRs might reflect participant-level aspects of task engagement (see also, e.g., Hopstaken, van der Linden, Bakker, & Kompier, 2015).
At the trial level, we observed a positive correlation between RTs and PDRs, that is, in trials with slower reaction times we observed larger PDRs in the same trial irrespective of sound type (as previously reported by Murphy et al., 2011). Again, following the suggestion by Strauch, Wang, Einhäuser, Van der Stigchel, and Naber (2022), this positive correlation could reflect intermediate-level factors related to alerting and orienting of attention (Petersen & Posner, 2012): at the trial level, pupil dilation indicates orienting and distraction of attention in response to stimuli occurring in the surroundings. Interestingly, as we did not observe an interaction effect of trial-level PDR and novelty, the relation between slower RT and larger PDR might also hold for standard trials. We suggest that the attentional orienting and enhanced stimulus processing observable at larger scales in distracting novel trials might also occur at smaller scales in standard trials, for example in relation to increased phasic NE release, potentially due to attentional orienting toward sound stimulation and spontaneous fluctuations of the LC activity (Jepma & Nieuwenhuis, 2011), resulting in enhanced processing of the current sound (and vice versa; Aston-Jones & Cohen, 2005) also in standard trials. The enhanced processing of the task-irrelevant standard and novel sounds can impair subsequent target stimulus-related processes, resulting in increased RT at the trial level.
Since PDR and RT at the trial level consider deviations relative to the participants' individual averages, whereas participant-level RT and PDR consider deviations of the participants' individual averages from the grand average, these correlations represent differentiable sources of variance and the relationships can point in different directions (Enders & Tofighi, 2007; see also LoTemplio et al., 2021 for the relation of P3b and RT in an oddball task). More generally, the trial-to-trial fluctuation of activity could reflect brain processes specific to that stimulus-driven behavior, whereas a difference between participants could reflect a general individual response bias to incoming stimuli. Our results demonstrate that the centering strategies common in multilevel models can also be applied effectively to disentangle enduring and transient effects in experimental settings.
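The decomposition described above can be illustrated with a minimal sketch. It is not the analysis code of the present study; the function name `center_within_between`, the input layout, and the example values are hypothetical, and the grand average is assumed here to be the mean of the participants' individual averages. The sketch splits a trial-level predictor (e.g., PDR) into a participant-level part (individual average minus grand average) and a trial-level part (single-trial value minus individual average).

```python
from statistics import mean


def center_within_between(trials):
    """Split a trial-level predictor into between- and within-participant parts.

    trials: list of (participant_id, value) tuples, one per trial.
    Returns a list of (participant_id, between, within) tuples where
      between = participant mean - grand mean (participant-level predictor)
      within  = trial value - participant mean (trial-level predictor)
    Assumption: the grand mean is the mean of the participant means.
    """
    # Collect all trial values per participant.
    by_participant = {}
    for pid, value in trials:
        by_participant.setdefault(pid, []).append(value)

    # Individual averages and the grand average across participants.
    participant_mean = {pid: mean(vals) for pid, vals in by_participant.items()}
    grand_mean = mean(list(participant_mean.values()))

    # One row per trial, in the original trial order.
    return [
        (pid, participant_mean[pid] - grand_mean, value - participant_mean[pid])
        for pid, value in trials
    ]


# Hypothetical PDR values for two participants with two trials each.
example = [("p1", 0.10), ("p1", 0.30), ("p2", 0.40), ("p2", 0.60)]
for row in center_within_between(example):
    print(row)
```

Because the within-participant part sums to zero for every participant, it carries no information about differences between individual averages. Entered as separate predictors in a multilevel model, the two parts can therefore yield coefficients of opposite sign, as observed here for RT and PDR.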

Conclusion
Our findings indicate that task-irrelevant and unexpected novel sounds impair performance in a categorization task and that distraction effects are reduced in response to emotional compared to neutral novel sounds. Transient changes in pupil size are larger in response to novel sounds compared to standard sounds, and this increase is larger for emotional than for neutral novel sounds. Our frequentist and Bayesian results disconfirm our hypothesis of a direct relation between reduced distraction effects on the behavioral level and increased arousal reflected by larger PDR to emotional novel sounds. We suggest that performance and pupil diameter reflect partly distinct processes. Given that the PDR has been discussed to indirectly reflect the activity of the locus coeruleus, the LC-NE system may embody a common antecedent for both effects, spreading norepinephrine to cortical areas involved in attention control and control of the pupil. In addition, the observed emotion-unspecific correlations between performance (RT) and levels of arousal (PDR), which differ between the trial and the participant level, provide new insights into the underlying mechanisms of potential fluctuation of the LC-NE system, aspects of individual task engagement and their effects on performance.
Finally, the present paradigm makes use of ethically harmless stimuli which do not cause anxiety, fear or threat. Thus, it can be applied to study the relation between attentional distraction and arousal in developmental and clinical studies.

Author notes
We thank Dunja Kunke and Tjerk Dercksen for proofreading and Carolin Albrecht for her assistance in investigating the cause for the singular model fits. We are grateful for support in data acquisition by Dunja Kunke and Gabriele Schoeps. We thank all participants for support. The project was funded by the DFG (WE 5026/1-2), the Center for Behavioral Brain Sciences Magdeburg funded by the European Regional Development Fund (ZS/2016/04/78120), and the Leibniz Association (P58/2017).

Data availability
Data will be made available on request.

Appendix A. Supplementary data
Supplementary data to this article can be found online at https://doi.org/10.1016/j.cognition.2023.105470.