Behavioural and neural signatures of perceptual evidence accumulation are modulated by pupil-linked arousal

The timing and accuracy of perceptual decision-making are exquisitely sensitive to fluctuations in arousal. Although extensive research has highlighted the role of neural evidence accumulation in forming decisions, our understanding of how arousal impacts these processes remains limited. Here we isolated electrophysiological signatures of evidence accumulation alongside signals reflecting target selection, attentional engagement and motor output and examined their modulation as a function of both tonic and phasic arousal, indexed by baseline and task-evoked pupil diameter, respectively. For both pupillometric measures, the relationship with reaction time was best described by a second-order, U-shaped, polynomial. Additionally, the two pupil measures were predictive of a unique set of EEG signatures that together represent multiple information processing steps of perceptual decision-making, including evidence accumulation. Finally, we found that behavioural variability associated with fluctuations in both tonic and phasic arousal was largely mediated by variability in evidence accumulation.


The speed and accuracy with which humans, as well as non-human animals, respond to a stimulus depend not only on the characteristics of the stimulus, but also on the cognitive state of the subject. When drowsy, a subject will respond more slowly to the same stimulus compared to when she is attentive and alert. Central arousal also fluctuates across a smaller range during quiet wakefulness, when the subject is neither drowsy or inattentive, nor overly excited or distractible. Although these trial-to-trial fluctuations can impact on behavioural performance during decision-making tasks (Aston-Jones and Cohen, 2005), it is largely unknown how arousal modulates the underlying processes that support decision formation. Perceptual decision-making depends on multiple neural processing stages that represent and select sensory information, those that process and accumulate sensory evidence, and those that prepare and execute motor commands. Variability in central arousal could affect any one or potentially all of these processing stages, which in turn could influence behavioural performance.

The neuromodulatory systems that control central arousal state, such as the noradrenergic [...]

[...] Figure 1D). We then used sequential multilevel model analyses and maximum likelihood ratio tests to test for fixed effects of pupil bin. We determined whether a linear fit was better than a constant fit and subsequently whether the fit of a second-order polynomial (e.g., U-shaped relationship), indicating a non-monotonic relationship between pupil diameter and behaviour/EEG, was superior to a linear fit.

Both tonic and phasic arousal are predictive of task performance in a U-shaped manner

We first investigated the relationship between trial-by-trial pupil dynamics and behavioural performance. As stimuli were presented well above perceptual threshold, our subjects performed at ceiling (Newman et al., 2017).
We therefore focused on RT and the RT coefficient of variation (RTcv), a measure of performance variability calculated by dividing the standard deviation in RT by the mean (Bellgrove et al., 2004), rather than accuracy. We found that both measures of behavioural performance displayed a non-monotonic, U-shaped, relationship with both baseline pupil diameter (RT χ²(1) = 8.98, p = 0.003; RTcv χ²(1) = 5.36, p = 0.020) and the pupil diameter response (RT χ²(1) = 116.65, p < 0.001; RTcv χ²(1) = 12.36, p < 0.001). Responses were fastest and least variable for intermediate pupil bins (Figure 1C & Figure 1E). We repeated this analysis in single-trial, non-binned data, in which we additionally controlled for time-on-task effects, confirming that these effects were not dependent on the binning procedure (Supplementary information). Additionally, we noticed that when we band-pass filtered the pupil diameter, rather than low-pass filtered, the relationship between baseline pupil diameter and task performance was not significant, whereas this did not affect the relationship between the pupil response and task performance (Supplementary figure 1). This suggests that slow fluctuations in baseline pupil diameter (<0.01 Hz) are driving the effect on task performance.

Having established a relationship between task performance and both tonic and phasic modes of central arousal state, we next focused on the relationship between these pupil dynamics and the [...]

[...] found that the onset latency of the evidence accumulation process, defined as the first time point that showed a significant difference from zero for 15 consecutive time points, displayed a quadratic relationship with the size of the pupil response (χ²(1) = 7.53, p = 0.006), such that the fastest onsets were found for intermediate pupil response bins and slower onsets for the extreme bins (Figure 2A).
Likewise, the slope of the CPP, reflecting the build-up rate of evidence accumulation, also displayed a non-monotonic, inverted-U shaped, relationship with the pupil response (χ²(1) = 7.81, p = 0.005). The amplitude of the CPP, representing the threshold of the accumulation process, did not vary with the pupil diameter response (p = 0.24). We thus found a direct relationship between phasic arousal and the onset and build-up rate of evidence accumulation. Moreover, the non-monotonic relationship with the neural parameters of the CPP closely resembled the relationship between the pupil response and behavioural performance (Figure 1E). Because the membrane potential of sensory neurons shows the least variance and highest response reliability at intermediate baseline pupil diameter (McGinley et al., 2015a), we additionally investigated the inter-trial phase coherence (ITPC), a measure of across-trial consistency, of the CPP. We computed ITPC with a single-taper spectral analysis in a 512 ms sliding window computed at 50 ms intervals, with a frequency resolution of 1.95 Hz (Materials and Methods). Based on the stimulus-locked grand average time-frequency spectrum, we selected a time (300-550 ms) and frequency window (<4 Hz) for further statistical analyses (Figure 2C). We found a quadratic (inverted U-shape) relationship between pupil diameter response and the consistency of the CPP signal (χ²(1) = 30.42, p < 0.001), indicating that the CPP signal is less variable for intermediate pupil response bins (Figure 2D). Together, these results confirm the hypothesized relationship between the pupil diameter response and electrophysiological correlates of evidence accumulation. Next, we asked whether other stages of information processing underpinning perceptual decision-making also varied with the pupil response.
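The ITPC measure described above can be illustrated with a minimal sketch. It is not the analysis code used here: the Hann taper, the 500 Hz sampling rate used in the usage note, and the function names `hann`, `fourier_coefficient`, `itpc` and `sliding_itpc` are all assumptions made for illustration. The sketch only shows the principle that ITPC is the length of the across-trial average of unit phase vectors, and that a 512 ms window implies a frequency resolution of 1/0.512 s ≈ 1.95 Hz.

```python
import cmath
import math


def hann(n):
    """Hann taper of length n (one possible single taper; an assumption)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def fourier_coefficient(segment, freq_hz, fs_hz):
    """Complex Fourier coefficient of one tapered segment at freq_hz."""
    taper = hann(len(segment))
    return sum(
        w * x * cmath.exp(-2j * math.pi * freq_hz * i / fs_hz)
        for i, (w, x) in enumerate(zip(taper, segment))
    )


def itpc(trials, freq_hz, fs_hz, start, n_window):
    """ITPC in [0, 1]: length of the mean unit phase vector across trials."""
    unit_vectors = []
    for trial in trials:
        coef = fourier_coefficient(trial[start:start + n_window], freq_hz, fs_hz)
        if abs(coef) > 0:
            # Discard amplitude, keep only the phase of this trial.
            unit_vectors.append(coef / abs(coef))
    if not unit_vectors:
        return 0.0
    return abs(sum(unit_vectors) / len(unit_vectors))


def sliding_itpc(trials, freq_hz, fs_hz, window_s=0.512, step_s=0.050):
    """ITPC in a sliding window (512 ms long, stepped every 50 ms by default)."""
    n_window = round(window_s * fs_hz)
    step = round(step_s * fs_hz)
    n_samples = min(len(t) for t in trials)
    return [
        itpc(trials, freq_hz, fs_hz, start, n_window)
        for start in range(0, n_samples - n_window + 1, step)
    ]
```

With a hypothetical sampling rate of 500 Hz, `sliding_itpc(trials, 500 / 256, 500.0)` evaluates the lowest resolvable frequency (about 1.95 Hz) in each window. Trials with identical phase give values near 1, and trials whose phases are spread evenly around the circle give values near 0.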

The phasic pupil response relates monotonically to spectral measures of baseline attentional engagement and displays a U-shaped relationship with motor output

We next investigated pre-target preparatory α-band power (8-13 Hz), a sensitive index of attentional deployment that has been shown to vary with behavioural performance. Specifically, previous studies have found higher pre-target α-band power preceding trials with longer RT, and that fluctuations in α-[...]

[...] (Loughnane et al., 2016). Because of the spatial nature of the task, we analysed the negative deflection over both the contra- (N2c) and ipsi-lateral (N2i) hemisphere, relative to the target location. The pupil response was not predictive of any aspect of the N2c. Specifically, phasic arousal was not predictive of N2c latency (p = 0.66) or amplitude (p = 0.39), nor did we find any relationship between the pupil response and the N2c ITPC (p = 0.57). Although the pupil response was not predictive of N2i latency (p = 0.53) or ITPC (p = 0.69), it was predictive of N2i amplitude (χ²(1) = 6.94, p = 0.008). Previously, we showed that the N2c, rather than N2i, correlated with RT and modulated CPP (Loughnane et al., 2016). It is therefore interesting that N2i, rather than N2c, varied with the pupil response. Below, we will discuss whether this effect could (partially) explain the relationship between the pupil response and task performance.

[...]

To test this possibility, we further investigated the relationship between pupil responses and the ipsilateral N2 target selection signal (Figure 3D). If on trials with lower behavioural performance attention was focused on the distractor stimulus, then early target selection signals contralateral to the distractor stimulus (i.e. ipsilateral to the target stimulus) might differ compared to trials with relatively better performance. Additionally, these differences might be present throughout the trial, before the N2i is expected to reveal differences between target and non-target stimuli (Loughnane et al., 2016). We therefore conducted a sliding window linear mixed effect model analysis predicting N2i amplitude for each 100 ms window, in 10 ms increments, from -20 before to 500 ms after target onset (Figure 4A).
This analysis revealed that the pupil diameter was predictive of N2i amplitude from as early as 70 ms after target onset, much earlier than the previously reported target selection onset of 308 ms (Loughnane et al., 2016), and therefore unlikely to reflect target processing. Rather, large pupil responses and a large N2 amplitude could reflect a bias in attention or expectation of the target [...]

[...] N2i responses) with the pupil diameter response (χ²(1) = 6.91, p = 0.009), with larger pupil responses for larger N2i amplitudes (Figure 4B). This suggests that attentional shifts, possibly through recruitment of the SCi, could lead to larger pupil responses and lower behavioural performance.

To further investigate whether these effects could explain (part of) the current results, we analysed the relationship between the pupil diameter response and task performance from a different dataset (Loughnane et al., 2018) in which participants (n = 17) monitored a single, centrally positioned, flickering checkerboard annulus for a gradual change in contrast (Figure 4C). Pupil diameter on this non-spatial task also displayed across-trial variability (Figure 4D), which predicted RT in a non-monotonic fashion (χ²(1) = 8.85, p = 0.003). RTcv did not scale with the pupil diameter response (Figure 4E) (p = 0.24). We furthermore confirmed that the non-monotonic relationship between the pupil response and RT was not dependent on the binning procedure or time-on-task effects by repeating this analysis on single-trial data in which we controlled for these factors (Supplementary Table 1). Thus, the U-shaped relationship between the pupil diameter response and behaviour (RT) cannot be attributed to attentional shifts away from a distractor stimulus and may be a more general phenomenon during protracted perceptual decision-making.

[...] task, reflect a correction from a state with low performance.
Indeed, trials with maximum pupil dilations and low task performance have been found to be preceded by trials with progressively longer RT, and followed by better task performance (Murphy et al., 2011). We therefore tested whether trials with a large pupil diameter response were preceded/followed by trials with worse/better task performance. Figure 4F-G shows the RT for trials relative to the trial on which the pupil response was measured (trial 0). Trials with larger pupil responses (bin 5, green) were preceded by trials with slower than average RT (Figure 4F), and this effect was observed for up to 3 trials before trial 0. Although on the subsequent trial (trial 1) RT was still slower than average, RT was significantly faster compared to trial 0 (Figure 4G). Additionally, the trials for which the pupil response is largest are the only trials on which there is both a decrease in task performance (increase in RT) from the previous trial, and a subsequent improvement in performance on the next trial. The other bins displayed the exact opposite pattern, and none of them showed an improvement in task performance after trial 0. A phasic pupillary response could thus indicate a compensatory mechanism, signalling the need to adjust the neural circuitry to a state that facilitates better performance. As Murphy et al. (2011) concluded, large pupil responses may reflect phasic activations driven by higher cortical performance monitoring brain regions that serve to reengage participants in the task.
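The trial-relative comparison described above (RT on the trials preceding and following trial 0) can be sketched as follows. This is an illustrative simplification rather than the analysis performed here: the function name `rt_around_bin`, the input layout (one RT and one pupil-bin label per trial, in presentation order) and the choice to express RT relative to the overall mean are assumptions, and the sketch ignores subject and block boundaries.

```python
from statistics import mean


def rt_around_bin(rts, bins, target_bin, lags=(-3, -2, -1, 0, 1)):
    """Mean RT, relative to the overall mean RT, at each lag around trials
    whose pupil-response bin equals target_bin (lag 0 is the trial itself)."""
    overall = mean(rts)
    result = {}
    for lag in lags:
        values = [
            rts[i + lag]
            for i, b in enumerate(bins)
            if b == target_bin and 0 <= i + lag < len(rts)
        ]
        # None marks lags for which no trial falls inside the recording.
        result[lag] = mean(values) - overall if values else None
    return result
```

Positive values indicate slower-than-average responses at that lag. A pattern of positive values at lags -3 to 0 followed by a smaller value at lag 1 would correspond to the profile reported for the largest pupil response bin.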

The impact of phasic arousal on task performance is mainly mediated by the consistency in evidence accumulation

Regardless of the neural mechanism, we found that pupil-linked phasic arousal was predictive of specific neural signals at multiple information processing stages of perceptual decision-making. To test which of these signals explained unique variability in behavioural performance across the 5 pupil response bins and subjects, the neural signals were added to a linear mixed effects model predicting either RT or RTcv with their order of entry determined hierarchically by their temporal order in the decision-making process. This allowed us to test whether each successive stage of neural processing would improve the fit of the model to the behavioural data, over and above the fit of the previous stage. Note that none of the predictors were highly correlated (r < 0.25), with the exception of CPP onset and CPP ITPC (r = 0.43), CPP build-up rate and CPP amplitude (r = -0.59), and LHB build-up rate and amplitude (r = -0.28). Compared to the baseline model predicting RT with pupil bin, the addition of pre-target α-power significantly improved the model fit (χ²(1) = 10.63, p < 0.001). None of the measures of early target selection improved the fit of the model; neither N2c latency (χ²(1) = 0.75, p = 0.39) or amplitude (χ²(1) = 0.47, p = 0.49), nor N2i latency (χ²(1) = 0.90, p = 0.34) or amplitude (χ²(1) = 2.34, p = 0.13). We found that the addition of both CPP onset (χ²(1) = 27.24, p < 0.001) and the build-up rate (χ²(1) = 11.74, p < 0.001) significantly improved the model fit. Whereas the addition of CPP amplitude did not (χ²(1) = 3.19, p = 0.07), the addition of CPP ITPC substantially improved the fit of the model (χ²(1) = 40.60, p < 0.001).
Although both LHB amplitude and build-up rate varied with phasic arousal, neither improved the fit of the model (LHB build-up rate χ²(1) = 2.09, p = 0.15; amplitude χ²(1) = 0.59, p = 0.44). Overall, this model suggested that pre-target α-power, CPP onset, build-up rate and ITPC exert partially independent influences on RT. Because some variables were highly correlated (e.g. CPP onset and ITPC) we used an algorithm for forward/backward stepwise model selection (Venables and Ripley, 2002) to test whether each neural signal indeed explained independent variability that is not explained by any of the other signals. This procedure eliminated CPP onset from the final model (F(1) = 2.60, p = 0.11). Thus, only pre-target α-power, CPP build-up rate and CPP ITPC significantly improved the model fit for predicting RT. These three variables were forced into one linear mixed effects model predicting RT (Statistical analyses), and comparison to a baseline model revealed a good fit (χ²(3) = 82.18, p < 0.001). The fixed effects of the model (the neural signals) explained 14.6% of the variability in RT (marginal r²) across the 5 pupil response bins, and together with the random effects (across subject variability) it explained 93.1% of the variability (conditional r²).

We performed the same hierarchical regression analysis to see which neural signals explained variability in RTcv. We summarised the results of this analysis in Supplementary [...] RTcv. Comparison against a baseline model revealed a significant fit (χ²(1) = 19.78, p < 0.001) that had a marginal r² of 11.1% and a conditional r² of 46.5%.

Table 1 shows the final parameter estimates for the neural signals that significantly predicted variability in RT or RTcv that is due to variability in phasic arousal. From this analysis we can conclude that CPP ITPC was the strongest predictor for RT and the only predictor for RTcv.
These [...]

Next, we turn to tonic arousal and its relationship to these same EEG components of perceptual decision-making.

Baseline pupil diameter is inversely related to the consistency of evidence [...]

[...] (1) = 4.40, p = 0.036). This suggests that with higher tonic arousal, alpha activity is higher (or less desynchronised). Next, we tested whether baseline pupil diameter was predictive of EEG characteristics representing motor output (Figure 6B). We found an approximately linear relationship with LHB build-up rate (χ²(1) = 11.1, p < 0.001), decreasing with larger baseline pupil diameter, but we did not find a relationship with LHB amplitude (p = 0.18).

Lastly, we investigated whether baseline pupil diameter affected our early target selection signal, the N2 (Figure 6C-D [...] that higher arousal has a negative impact on sensory encoding. N2c ITPC did not vary with baseline pupil diameter (p = 0.30), and nor did N2i ITPC (p = 0.26), N2i latency (p = 0.87) or amplitude (p = 0.06). We thus found that, similar to the phasic pupil diameter response, baseline pupil diameter is predictive of specific characteristics of each of the processing stages of perceptual decision-making. Next, we investigated which of these components explained unique variance in task performance across pupil size bins.

Consistency in evidence accumulation mediates the influence of tonic arousal on task performance

We again performed the same hierarchical regression analysis as described above, to see which of the neural signals explained unique variability in task performance associated with tonic arousal. The full results of this analysis are summarised in Supplementary Table 3. Here we discuss the main findings. After the application of a forward/backward model selection algorithm (Venables and Ripley, 2002), N2c amplitude and CPP ITPC were the only parameters that were predictive of RT (Table 1).
These variables were forced into one regression model predicting RT, and comparison against a baseline model with baseline pupil diameter as a factor revealed a significant fit (χ²(2) = 32.6, p < 0.001) with a marginal (conditional) r² of 4.2% (94.4%). This same hierarchical regression procedure revealed that CPP ITPC was the only EEG component that explained unique variability in RTcv (Table 1). Comparison against a baseline model also led to a significant fit (χ²(1) = 26.59, p < 0.001), with a marginal (conditional) r² of 11.7% (43.3%).

Thus, additional to a small effect of N2c amplitude on RT, the consistency of the evidence accumulation process was the only stage of information processing that explained unique within and across-subject variability in task performance associated with changes in baseline pupil diameter.

During decision-making, baseline pupil diameter does not always predict task performance in a U-shaped manner

Other than a small non-monotonic relationship with pre-target α power, none of the relationships between baseline pupil diameter and the other EEG components was best described by a quadratic polynomial. We therefore asked whether the U-shaped relationship with task performance is a general phenomenon during decision-making. To this end, we again analysed the data from a different dataset using a contrast change detection paradigm where subjects monitored a single central target stimulus (Loughnane et al., 2018). Here, we found a small non-monotonic relationship between baseline pupil diameter and RT (χ²(1) = 4.33, p = 0.038), and no relationship with RTcv (p = 0.13) (Figure 6E-F). Because of the small size of the non-monotonic effect with RT, we repeated this analysis in single-trial, non-binned data, to investigate whether this effect arose from the binning procedure, or time-on-task effects (Supplementary Table 1).
This analysis revealed a monotonic relationship between baseline pupil diameter and RT (χ²(1) = 8.21, p = 0.004), but no non-monotonic relationship (p = 0.18). Because the non-monotonic relationship was not found using single trial data, we additionally plotted the inverse monotonic relationship between baseline pupil diameter and RT for the binned data (Figure 6F).

It thus seems that on this task, higher levels of central arousal, as opposed to intermediate levels, are associated with improved task performance.
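The sequential model comparisons reported throughout these results (constant versus linear, then linear versus second-order polynomial) rest on likelihood ratio tests whose statistic is referred to a χ² distribution with one degree of freedom. The following is a minimal sketch of that logic under strong simplifying assumptions: it fits ordinary least-squares polynomials with Gaussian errors and no random effects, whereas the analyses reported here used multilevel models. The function names `chi2_sf_df1`, `polyfit_rss` and `lr_statistic` are hypothetical.

```python
import math


def chi2_sf_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def polyfit_rss(x, y, degree):
    """Residual sum of squares of an ordinary least-squares polynomial fit."""
    k = degree + 1
    # Normal equations (X'X) b = X'y for the design matrix [1, x, x^2, ...].
    a = [[sum(xi ** (r + c) for xi in x) for c in range(k)] for r in range(k)]
    b = [sum((xi ** r) * yi for xi, yi in zip(x, y)) for r in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, k):
            factor = a[r][col] / a[col][col]
            for c in range(col, k):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coef = [0.0] * k
    for r in reversed(range(k)):
        tail = sum(a[r][c] * coef[c] for c in range(r + 1, k))
        coef[r] = (b[r] - tail) / a[r][r]
    return sum(
        (yi - sum(cf * xi ** p for p, cf in enumerate(coef))) ** 2
        for xi, yi in zip(x, y)
    )


def lr_statistic(x, y, degree_reduced, degree_full):
    """Likelihood ratio statistic 2 * (ll_full - ll_reduced) for nested
    Gaussian polynomial fits, which reduces to n * ln(RSS_reduced / RSS_full)."""
    n = len(x)
    return n * math.log(polyfit_rss(x, y, degree_reduced) / polyfit_rss(x, y, degree_full))
```

For example, `chi2_sf_df1(lr_statistic(bins, rt, 1, 2))` would test whether a quadratic fit improves on a linear one for hypothetical lists `bins` and `rt`. As a consistency check on the conversion itself, `chi2_sf_df1(8.21)` is roughly 0.004, in line with the value reported above for χ²(1) = 8.21.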

Here we investigated whether behavioural and neural correlates of decision-making varied as a function of baseline or task-evoked pupil diameter. The perceptual decision-making paradigm employed (Figure 1A) allowed us to monitor the relationship between pupil diameter and independent measures of attentional engagement, early target selection, evidence accumulation and motor output. We found that the trial-by-trial variability in both tonic and phasic arousal, as measured by the size of the baseline pupil diameter and pupil response (Figure 1B-D), respectively, was predictive of behavioural performance. This relationship was best described by a second-order, U-shaped, polynomial fit for both RT and the variability of RT, RTcv (Figure 1C-E).

We furthermore established that both tonic and phasic arousal were predictive of a subset of EEG signatures, together reflecting discrete aspects of information processing underpinning perceptual decision-making. A hierarchical regression analysis allowed us to determine which of these processing stages exerted an independent influence on behavioural performance associated with central arousal. We found that pre-target α power, indexing baseline attentional engagement, and the build-up rate and consistency of the CPP, reflecting the evidence accumulation process, each explained unique variability in task performance that was due to variability in phasic arousal. Variability in task performance due to variability in tonic arousal was explained by the amplitude of the target selection signal N2c and the consistency of the CPP.

We thus revealed a direct relationship between both tonic and phasic measures of arousal, and a distinct but overlapping set of EEG signatures of perceptual decision-making.
[...] this study, we found a strong non-monotonic, U-shaped, relationship between phasic pupil dilations, behavioural performance and EEG signatures during a decision-making task. Here, the largest pupillary constriction and dilation were associated with the poorest behavioural performance, whereas a modest dilation was associated with the best performance.

Arousal determines the way a subject interacts with its environment. Intermediate arousal allows for optimal interaction with the task at hand, whereas suboptimal performance is observed when the subject is either too drowsy or too excitable/distractible (Yerkes and Dodson, 1908 [...]

[...] Figure 4A). This difference was present as early as 70 ms after target onset, making it unlikely that it reflects target processing. Rather, large pupil responses, accompanied by larger N2i amplitudes, could indicate that attention was more biased towards one of the stimuli. Trials where the non-attended stimulus turned out to be the target would require a shift in attention, which in turn could be the cause of the delay in response. Indeed, we found an inverse relationship between the N2i amplitude and the size of the pupil response (Figure 4B), suggesting that the need for an attentional shift elicits large pupil responses, which could explain the lower behavioural performance on trials with larger pupil dilations. To see whether attentional shifts could be the sole mechanism by which to explain the U-shaped relationship between pupil response size and task performance, we analysed data from a different experiment in which participants monitored a single stimulus (Figure 4C-E). On this task, although pupil responses did not relate to RTcv, they were predictive of RT in a quadratic manner.
The lack of relationship with RTcv on the contrast change detection task implies that variability in RT on the motion detection task could be brought about by shifts in attention, and thus explain the U-shaped relationship between the pupil response and RTcv. However, the quadratic relationship with RT indicates that shifts in attention cannot be the sole cause of the U-shaped relationship between the pupil response and task performance, and that this might be a more general phenomenon during protracted decision-making.

[...] Rajkowski et al., 1994, 2004). Instead, on trials where discrimination is more difficult and RT latencies are longer, the LC response is delayed (Rajkowski et al., 2004), which would bring about a delay in pupil dilation rather than an immediate, larger response. Although at odds with these studies, Murphy et al. (2011) previously described a similar relationship, in which trials with large pupil responses were preceded by progressively worse performance which was subsequently followed by better task performance. This finding was interpreted as a compensatory mechanism, driven by cortical performance monitoring brain regions that, via a phasic LC response, possibly reflect a reset of the network (Bouret and Sara, 2005) to reengage participants in the task. Another possible neural mechanism that may lead to the same behavioural outcome is that this effect is driven by cholinergic transients that have been hypothesized to signify a switch from a 'signal-detection down' to a 'signal-detection up' state, facilitating target detection (Sarter et al., 2016).

[...] build-up rate, as well as an inverse relationship with CPP ITPC. Of these, only N2c amplitude and CPP ITPC explained within and across subject variability in task performance (Table 1).
It thus seems that the effect of tonic arousal on task performance is mainly driven by an approximately linear relationship with target selection and evidence accumulation consistency. This led us to question whether a U-shaped relationship between tonic arousal and task performance on protracted visual decision-making tasks is a more general phenomenon, or heavily dependent on specific aspects of the behavioural paradigm. The absence of a non-monotonic relationship between baseline pupil diameter and task performance during contrast change detection (Figure 6F) suggests the latter. These differences could be driven by different task demands; on simple tasks performance may benefit from increases in arousal, whereas optimal performance on more difficult discrimination tasks could be found with intermediate arousal (Yerkes and Dodson, 1908; McGinley et al., 2015b). RT was however substantially longer on the task where we did not find a U-shaped relationship (compare Figure 1E & Figure 6F), suggesting that this task was more demanding. Alternatively, the relationship between tonic arousal and task performance could be contingent on attentional demands. On tasks with longer RT that require accumulation of evidence across a longer time-period, greater sustained attention is required, which could benefit from increased arousal and would thus predict an inverse linear relationship between baseline pupil diameter and performance (Figure 6F).

Depending on the behavioural paradigm and task demands, the relationship between central arousal, performance and neural activity may take different forms (McGinley et al., 2015b).
Membrane potential recordings from sensory and association areas, as well as direct electrophysiological recordings from neuromodulatory brainstem centres during protracted decision-making, are needed to gain further insight in the exact mechanisms that drive the relationship between cortical state, sensory encoding, evidence accumulation and task performance.

[...] the dense LC innervation of the neural areas thought to be its source, the P3 has been hypothesized to reflect the LC phasic response (Nieuwenhuis et al., 2005). It thus seems likely that the CPP, and therefore evidence accumulation, is also influenced by LC activity. Likewise, it seems plausible that ACh affects attentional processes/evidence accumulation in parietal cortex, and thus also the CPP.

[...]

Here we found that the consistency of evidence accumulation was the main EEG predictor of variability in task performance associated with both tonic and phasic arousal. For tonic arousal, although CPP ITPC did not follow the same U-shaped relationship as task performance, our findings are largely in line with modelling studies which suggested that higher arousal is specifically predictive of more variability in evidence accumulation (Murphy et al., 2014b). For phasic arousal, higher consistency, and thus less variability, was found for intermediate pupil bins, which also displayed the best behavioural performance. These results suggest that similar neural mechanisms of cortical state [...]

[...], to the best of our knowledge, the influence of pupil-linked arousal on target selection signals has not been described before. Here, we showed that early target selection signals are modulated by tonic arousal such that larger baseline pupil diameter was predictive of smaller N2c amplitudes (Figure 6C). Moreover, the amplitude of the N2c also explained unique variability in task performance across pupil bins and subjects (Table 1).
At first glance it seems counterintuitive that target selection amplitudes are decreased, whereas visual encoding in early visual cortex is enhanced on trials with larger baseline pupil diameter (Vinck et al., 2015), or during pupil dilation (Reimer et al., 2014). These differences could be due to differences in the nature of the recordings, as these previous studies used invasive electrophysiology and calcium imaging whereas we used scalp EEG, limiting especially the spatial resolution of our analyses that might be necessary to elucidate these effects (e.g. single neuron orientation tuning). Alternatively, they could constitute differential effects of arousal on visual encoding and target selection. More likely, however, they are due to specific task demands, in particular our use of multiple simultaneously presented competing stimuli. Indeed, there is some evidence that an increase in arousal, as measured by pupil diameter, can increase the ability of a distractor to disrupt performance on a Go/No-Go task in non-human primates (Ebitz et al., 2014). At high arousal levels, performance might thus be negatively affected when the task requires the successful suppression of distracting information, i.e. with higher arousal it is more difficult to focus on the task at hand (Aston-Jones and Cohen, 2005; McGinley et al., 2015b). On the current task, it might thus be more difficult to select and process information from one of the two competing stimuli during states of high arousal, leading to reduced N2c amplitude as well as reduced performance.

In addition to the effects of tonic arousal on the N2c, we found that phasic arousal was predictive of the amplitude of the N2i (Figure 3D). However, because this effect was not restricted to the time period around the peak latency, and was present from as early as 70 ms (Figure 4A), it is unlikely to reflect target selection (see above).
Rather, it seems plausible that these differences reflect differences in the expected location of target presentation. Thus, our observation that phasic arousal was not predictive of any aspect of target selection is broadly consistent with de Gee et al. (2017), who found that the pupil response was not predictive of sensory responses.

Concluding remarks

In this study we investigated the relationship between measures of tonic and phasic pupil-linked arousal and behavioural and EEG measures of perceptual decision-making. We found that trial-to-trial variability in both tonic and phasic arousal accounted for variability in task performance and was predictive of a unique, but overlapping, set of neural metrics of perceptual decision-making. These results confirm our hypothesized relationship between pupil diameter and the electrophysiological correlates of evidence accumulation, providing further support for the notion that the neuromodulators that control central arousal are recruited throughout the decision-making process. Moreover, the relationships with task performance were best described by a second-order, U-shaped, polynomial model fit, indicating that during decision-making there are optimal levels of both tonic and phasic activity in the (network of) neuromodulatory centres that control central arousal. Although we found that pupil-linked arousal was predictive of EEG correlates associated with attentional engagement, target selection, evidence accumulation and motor output, the effects of arousal on behavioural performance are mainly mediated through the consistency in evidence accumulation.
To confirm whether each of the neural signals selected by the hierarchical regression analysis indeed had a significant effect on task performance, we performed a robust regression (Supplementary Table 4) based on 5000 bootstrap replicates to calculate the 95% confidence intervals around the β parameter estimates for the final model fit (Table 1)
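The idea of bootstrapped confidence intervals around a regression estimate can be sketched as follows. This is a deliberately simplified illustration and not the procedure used here: it resamples cases for a single-predictor ordinary least-squares slope and reports a percentile interval, whereas the analysis above used a robust regression on the final model. The function names `ols_slope` and `bootstrap_slope_ci`, and the fixed random seed, are assumptions.

```python
import random


def ols_slope(x, y):
    """Ordinary least-squares slope of y on a single predictor x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def bootstrap_slope_ci(x, y, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the slope (case resampling)."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:
            # Degenerate resample: the slope is undefined, so draw again.
            continue
        slopes.append(ols_slope(xs, [y[i] for i in idx]))
    slopes.sort()
    lower = slopes[int((alpha / 2) * n_boot)]
    upper = slopes[min(n_boot - 1, round((1 - alpha / 2) * n_boot) - 1)]
    return lower, upper
```

A predictor would be judged to have a reliable effect under this scheme when the interval returned by `bootstrap_slope_ci` excludes zero.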