Neural precursors of deliberate and arbitrary decisions in the study of voluntary action

The readiness potential (RP), a key ERP correlate of upcoming action, is known to precede subjects' reports of their decision to move. Some view this as evidence against a causal role for consciousness in human decision-making and thus against free will. Yet those studies focused on arbitrary decisions: purposeless, unreasoned, and without consequences. It remains unknown to what degree the RP generalizes to deliberate, more ecological decisions. We directly compared deliberate and arbitrary decision-making during a $1000-donation task to non-profit organizations. While we found the expected RPs for arbitrary decisions, they were strikingly absent for deliberate ones. Our results and drift-diffusion model are congruent with the RP representing accumulation of noisy, random fluctuations that drive arbitrary, but not deliberate, decisions. They further point to different neural mechanisms underlying deliberate and arbitrary decisions, challenging the generalizability of studies that argue for no causal role for consciousness in decision-making to real-life decisions.

Significance Statement
The extent of human free will has been debated for millennia. Previous studies demonstrated that neural precursors of action, especially the readiness potential, precede subjects' reports of deciding to move. Some viewed this as evidence against free will. However, these experiments focused on arbitrary decisions, e.g., randomly raising the left or right hand. We directly compared deliberate (actual $1000 donations to NPOs) and arbitrary decisions, and found readiness potentials before arbitrary decisions, but, critically, not before deliberate decisions. This supports the interpretation of readiness potentials as byproducts of accumulation of random fluctuations in arbitrary but not deliberate decisions and points to different neural mechanisms underlying deliberate and arbitrary choice. Hence, it challenges the generalizability of previous results from arbitrary to deliberate decisions.


Introduction
Humans typically experience freely selecting between alternative courses of action, say, when ordering a particular item off a restaurant menu. Yet a series of human studies using electroencephalography (EEG) (Haggard & Eimer, 1999 [...] (Perez et al., 2015), and single-cell recordings (Fried, Mukamel, & Kreiman, 2011) challenged the veridicality of this common experience. These studies found neural correlates of decision processes hundreds of milliseconds and even seconds prior to the moment that subjects reported having consciously decided.

The seminal research that launched this series of studies was conducted by Benjamin Libet and colleagues (Libet et al., 1983). There, the authors showed that the readiness potential (RP), a ramp-up in EEG negativity before movement onset that is thought to originate from the presupplementary motor area (pre-SMA), begins before subjects report a conscious decision to act. Libet and colleagues took the RP to be a marker for an unconscious decision to act (Libet et al., 1983; Soon et al., 2008) that, once it begins, ballistically leads to action (Shibasaki & Hallett, 2006). Under that interpretation, the fact that RP onset precedes the report of the onset of the conscious decision to act was taken as evidence that decisions about actions are made unconsciously, and thus that the subjective human experience of freely and consciously deciding to act is but an illusion (Harris, 2012; Libet et al., 1983; Wegner, 2002). This finding has been at the center of the free-will debate in neuroscience for almost four decades, captivating scholars from many disciplines in and outside of academia (C. Frith [...]

Critically, in the above studies, subjects were told to arbitrarily move their right hand or flex their right wrist; or they were instructed to arbitrarily move either the right or left hand (Haggard, 2008; Hallett, 2016; Roskies, 2010 [...] (Gold & Shadlen, 2007).
Yet, interestingly, little has been done in that field to assess the relation between decision-related activity, subjects' conscious experience of deciding, and the neural activity instantaneously contributing to this experience. Though some studies compared, for example, internally driven and externally cued decisions (Thut et al., 2000; Wisniewski, Goschke, & Haynes, 2016), or stimulus-based and intention-based actions (Waszak et al., 2005), these were typically arbitrary decisions and actions with no real implications. Therefore, the results of these studies provide no direct evidence about potential differences between arbitrary and deliberate decisions.

Such direct comparisons are critical for the free-will debate, because it is deliberate, rather than arbitrary, decisions that are at the center of philosophical arguments about free will and moral responsibility (Breitmeyer, 1985; Maoz & Yaffe, 2015; Roskies, 2010). Deliberate decisions [...]

The consistency between subjects' choices throughout the main experiment and the NPO ratings they gave prior to the main experimental session was also analyzed using a 2-way ANOVA (see Methods). As expected, subjects were highly consistent with their own, previous ratings when making [...] subjects were around chance (i.e., 0.5) in their consistency in arbitrary decisions (ranging between 0.39 and 0.64), it seems that some subjects were slightly influenced by their preferences in easy-arbitrary decision trials, resulting in the significant difference between hard-arbitrary and easy-arbitrary decisions above, though the Bayes factor was inconclusive. Finally, no differences were found between subjects' tendency to press the right vs. left key in the different conditions (both main effects and interaction: F<1).
The RP is generally held to index unconscious readiness for upcoming movement (Haggard, 2008; Kornhuber [...] Methods), we found a prolonged cluster (~1.2 s) of activation that reliably differed from 0 in both arbitrary conditions (designated by horizontal blue-shaded lines above the x axis in Fig. 3A). The same analysis revealed no clusters of activity differing from zero in either of the deliberate conditions.

[...] with an expanded timecourse, and stimulus-locked potentials are given in Fig. 6B and 6A, respectively. The same (response-locked) potentials as here, but with a movement-locked baseline of -1 to -0.5 s (same as in our Bayesian analysis), are given in Fig. 6C.

Control analyses
We further tested whether differences in reaction time between the conditions, eye movements, filtering, and subjects' consistency scores might explain our effect. We also tested whether the RPs might reflect some stimulus-locked potentials or be due to baseline considerations.

Differences in reaction times (RT) between conditions, including stimulus-locked potentials and baselines, do not drive the effect

RTs in deliberate decisions were typically more than twice as long as RTs in arbitrary decisions. We therefore wanted to rule out the possibility that the absence of RP in deliberate decisions stemmed from the difference in RTs between the conditions. We carried out six analyses for this purpose. First, we ran a median-split analysis, dividing the subjects into two groups based on their RTs: lower (faster) and higher (slower) than the median, for deliberate and arbitrary trials, respectively. We then ran the same analysis using only the faster subjects in [...]

While the results of the above analyses suggested that our effects do not stem from differences between the RTs in deliberate and arbitrary decisions, the average RTs for fast deliberate subjects were still 660 ms slower than for slow arbitrary subjects. In addition, we had only half of the subjects in each condition due to the median split, raising the concern that some of our null results might have been underpowered. We also wanted to look at the effect of cross-trial variations within subjects and not just cross-subject ones. We therefore ran a third, within-subjects analysis. We combined the two decision difficulties (easy and hard) for each decision type (arbitrary and deliberate) for greater statistical power. We then took the faster (below-median RT) deliberate trials and slower (above-median RT) arbitrary trials for each subject separately. So, this time we had 17 subjects (again, one was removed) and better-powered results. Here, fast deliberate trials (M=1.63 s, SD=0.25) were just 230 ms slower than slow arbitrary trials (M=1.40 s, SD=0.45), on average. This cut the difference between fast deliberate and slow arbitrary by about 2/3 from the between-subjects analysis. We then computed the RPs for just these fast deliberate and slow arbitrary trials within each subject (Fig. 5C).
Visually, the pattern there is the same as in the main analysis (Fig. 3A). What is more, deliberate and arbitrary decisions remained reliably different (t(16)=3.36, p=0.004). Arbitrary trials were again different from 0 (t(16)=-4.40, p=0.0005), while deliberate trials were not (t(16)=-1.54, p=0.14).
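The within-subject trial selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the trial representation (a list of dicts with hypothetical `decision` and `rt` keys) and the function name are invented for the example, and trials whose RT equals the median are left out of both subsets.

```python
from statistics import median


def fast_deliberate_slow_arbitrary(trials):
    """Select one subject's below-median deliberate and above-median arbitrary trials.

    `trials` is a list of dicts with hypothetical keys:
      'decision': 'deliberate' or 'arbitrary' (easy and hard trials pooled)
      'rt': reaction time in seconds
    """
    deliberate = [t for t in trials if t["decision"] == "deliberate"]
    arbitrary = [t for t in trials if t["decision"] == "arbitrary"]

    # Medians are computed separately for each decision type.
    deliberate_median = median(t["rt"] for t in deliberate)
    arbitrary_median = median(t["rt"] for t in arbitrary)

    fast_deliberate = [t for t in deliberate if t["rt"] < deliberate_median]
    slow_arbitrary = [t for t in arbitrary if t["rt"] > arbitrary_median]
    return fast_deliberate, slow_arbitrary
```

The RPs would then be averaged over each returned subset per subject before the group-level comparison.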

(confidence intervals: dashed red lines). The R² is 0.05. One subject, #7, had an RT difference between deliberate and arbitrary decisions that was more than 6 interquartile ranges (IQRs) away from the median difference across all subjects. That same subject's RT difference was also more than 5 IQRs higher than the 75th percentile across all subjects. That subject was therefore designated an outlier and removed only from this regression analysis.
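An IQR-based outlier rule of the kind described above can be sketched as follows. This is an illustration only: the function name and the cutoff `k=5.0` are assumptions chosen to mirror the "more than 5 IQRs higher than the 75th percentile" figure quoted above, and the quartile interpolation method is likewise an assumption, not necessarily the one used in the study.

```python
from statistics import quantiles


def iqr_outliers(values, k=5.0):
    """Flag values lying more than k IQRs above Q3 or below Q1.

    Returns a list of booleans aligned with `values`.
    """
    # 'inclusive' treats the data as the full population and interpolates linearly.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return [v > q3 + k * iqr or v < q1 - k * iqr for v in values]
```

For example, applied to per-subject RT differences (deliberate minus arbitrary), any subject flagged `True` would be excluded from the regression only.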

We further regressed the within-subject differences between RPs in fast deliberate and slow arbitrary decisions (defined as above) against the differences between the corresponding RTs for each subject to ascertain that such a correlation would not exist for trials that are closer together. We again found no reliable relation between the two differences (Fig. 5D [...]

Yet another concern that could relate to the RT differences among the conditions is that the RP in arbitrary blocks might actually be some potential evoked by the stimuli (i.e., the presentations of the two causes), specifically in arbitrary blocks, where the RTs are shorter (and thus stimuli-evoked effects could still affect the decision). In particular, a stimulus-evoked potential might just happen to bear some similarity to the RP when plotted locked to response onset. To test this explanation, we ran a fifth analysis, plotting the potentials in all conditions, locked to the onset of the stimulus (Fig. 6A). We also plotted the response-locked potentials across an expanded timecourse for comparison (Fig. 6B). If the RP-like shape we see in Figs. 3A and 6B is due to a stimulus-locked potential, we would expect to see the following before the 4 mean response onset times (indicated by vertical lines at 0.98, 1.00, 2.13, and 2.52 s for arbitrary easy, arbitrary hard, deliberate easy, and deliberate hard, respectively) in the stimulus-locked plot (Fig. 6A): consistent potentials, which precede the mean response times, that would further be of a similar shape and magnitude to the RPs found in the decision-locked analysis in the arbitrary condition (though potentially more smeared for stimulus locking). We thus calculated a stimulus-locked version of our ERPs, using the same baseline (Fig. 6A). As the comparison between Fig. 6A and 6B clearly shows, no such consistent potentials were found before the 4 response times, nor were these potentials similar to the RP in either shape or magnitude (their magnitudes are at the most around 1 µV, while the RP magnitudes we found are around 2.5 µV; Figs. 3A, 6B). This analysis thus suggests that it is unlikely that a stimulus-locked potential drives the RP we found.

Notably, the stimulus-locked alignment did imply that the arbitrary easy condition evoked a stronger activity in roughly the last 0.5 s before stimulus onset. However, this prestimulus activity cannot explain the response-locked RP, as it was found only in arbitrary easy trials and not in arbitrary hard trials. At the same time, the response-locked RP did not differ between these conditions. What is more, easy and hard trials were randomly interspersed within deliberate and arbitrary blocks, and the subject discovered the trial difficulty only at stimulus onset. Thus, there could not have been differential preparatory activity that varies with decision difficulty. This divergence in one condition only is accordingly not likely to reflect any preparatory RP activity.

One more concern is that the differences in RTs may affect the results in the following manner: Because the main baseline period we used thus far was 1 to 0.5 s before stimulus onset, the duration from the baseline to the decision varied widely between the conditions. To make sure this difference in temporal distance between the baseline period and the response to which the ERPs were locked did not drive our results, we recalculated the potentials for all conditions with a response-locked baseline of -1 to -0.5 s (Fig. 6C; the same baseline we used for the Bayesian analysis above). The rationale behind this choice of baseline was to have the time that elapsed from baseline to response onset be the same across all conditions. As is evident in Fig. 6C [...] evidence against the claim that the differences in RPs stem from or are affected by the differences in RTs between the conditions.

Though ICA was used to remove blink artifacts and saccades (see Methods), we wanted to make sure our results do not stem from differential eye movement patterns between the conditions. We therefore computed a saccade-count metric ( [...] We further investigated potential effects of saccades by running a median-split analysis, dividing the trials for each subject into two groups based on their SC score: lower and higher than the median, for deliberate and arbitrary trials, respectively. We then ran the same analysis using only the trials with more saccades in the deliberate condition (SC was 2.02±0.07 and 2.04±0.07 for easy and hard, respectively) and those with fewer saccades for the arbitrary condition (SC was 1.33±0.07 and 1.31±0.08 for easy and hard, respectively). If the number of saccades affects RP amplitudes, we would expect that the differences in RPs between arbitrary and deliberate trials will diminish, or even reverse (as now we had more saccades in the deliberate condition). However, though there were only half the data points for each subject in each condition, a similar pattern of results to those over the whole dataset was observed: Deliberate and arbitrary decisions were still reliably different within the median-split RPs ( [...]

The LRP, which reflects activation processes within the motor cortex for action preparation after action selection (Eimer, 1998 [...] window (Eimer, 1998; Haggard & Eimer, 1999). In this purely motor component, no difference was found between the two decision types and conclusive evidence against an effect of decision type was further found (Fig. 7; all Fs<0.35; BF=0.299).
Our analysis of EOG channels suggests that some of that LRP might be driven by eye movements (we repeated the LRP computation on the EOG channels instead of C3 and C4). However, the shape of the eye-movement-induced LRP is very different from the LRP we calculated from C3 and C4. Also, the differences that we found between conditions in the EOG LRP are not reflected in the C3/C4 LRP. So, while our LRP might be boosted by eye movements, it is not strictly driven by these eye movements.

[...] according to this model, the threshold crossing leading to response onset is largely determined by spontaneous, subthreshold, stochastic fluctuations of the neural activity. This interpretation of the RP challenges its traditional understanding as stemming from specific, unconscious preparation for, or ballistic-like initiation of, movement (Shibasaki & Hallett, 2006). Instead, Schurger and colleagues claimed, the RP is not a cognitive component of motor preparation; it is an artifact of accumulating autocorrelated noise to a hard threshold and then looking at signals only around threshold crossings.

We wanted to investigate whether our results could be accommodated within the general framework of the Schurger model, though with the deliberate and arbitrary decisions mediated by two different mechanisms. The first mechanism is involved in value assessment and drives deliberate decisions. It may be subserved by brain regions like the ventromedial prefrontal cortex (VMPFC; Ramnani & Owen, 2004; Wallis, 2007). But, for the sake of the model, we will remain agnostic about the exact location associated with deliberate decisions and refer to this region as Region X. A second mechanism, possibly at the (pre-)SMA, was held to generate arbitrary decisions driven by random noise fluctuations.

Accordingly, we expanded the model developed by Schurger et al. (2012) in two manners.
First, we defined two DDM processes: one devoted to value assessment (in Region X) and the other to noise generation (in SMA; see Fig. 8A and Methods). Both of them were run during both decision types, yet the former determined the result of deliberate trials, and the latter determined the results of arbitrary trials. Second, Schurger and colleagues modeled only when subjects would move and not what (which hand) subjects would move. We wanted to account for the fact that, in our experiment, subjects not only decided when to move, but also what to move (either to indicate which NPO they prefer in the deliberate condition, or to generate a meaningless right/left movement in the arbitrary condition). We modeled this by defining two types of movement. One was moving the hand corresponding to the location of the NPO that was rated higher in the first, rating part of the experiment (the congruent option; see Methods). The other was moving the hand corresponding to the location of the lower-rated NPO (the incongruent option). We used the race-to-threshold framework to model the decision process between this pair of leaky, stochastic accumulators, or DDMs. One DDM simulated the process that leads to selecting the congruent option, and the other simulated the process that leads to selecting the incongruent option (see again Fig. 8A). (We preferred the race-to-threshold model over a classic DDM with two opposing thresholds because we think it is biologically more plausible (de Lafuente, Jazayeri, & Shadlen, 2015) and because it is easier to see how a ramp-up-like RP might be generated from such a model without requiring a vertical flip of the activity accumulating toward one of the thresholds in each model run.)

Hence, in each model run, the two DDMs in Region X and the two in the SMA ran in parallel; the first one to cross the threshold (only in Region X for deliberate decisions and only in the SMA for arbitrary ones) determined decision completion and outcome. Thus, if the DDM corresponding to the congruent (incongruent) option reached the threshold first, the trial ended with selecting the congruent (incongruent) option. For deliberate decisions, the congruent cause had a higher value than the incongruent cause and, accordingly, the DDM associated with the congruent option had a higher drift rate than that of the DDM associated with the incongruent option. For arbitrary decisions, the values of the decision alternatives mattered little and this was reflected in the small differences among the drift rates and in other model parameters (Table 1).

Therefore, within this framework, Cz-electrode activity (above the SMA) should mainly reflect the SMA component. (Note that we suggest that noise generation might be a key function of the SMA and other brain regions underneath the Cz electrode, at least during this specific task. When subjects make arbitrary decisions, these might be based on some symmetry-breaking mechanism, which is driven by random fluctuations that are here simulated as noise. Thus, we neither claim nor think that noise generation is the main purpose or function of these brain regions in general.) And so, finding that the model-predicted EEG activity resembles the actual EEG pattern we found would imply that our findings are compatible with an account by which the RP represents an artifactual accumulation of stochastic, autocorrelated noise, rather than indexing a genuine marker of an unconscious decision ballistically leading to action.

[...] Hence, when the first DDM of the Region X pair would reach the threshold, the decision would be completed, and movement would ensue.
At the same time and in contrast, the SMA pair would not integrate toward a decision (Fig. 8B). We modeled this by not including any decision threshold in the SMA in deliberate decisions (i.e., the threshold was set to infinity, letting the DDM accumulate forever). (The corresponding magnitudes of the drift rate and other parameters are detailed in the Methods and Table 1.) So, when Region X activity reaches the threshold, the SMA (supposedly recorded using electrode Cz) will have happened to accumulate to some random level (Fig. 8B). This entails that, when we align such SMA activity to decision (or movement) onset, we will find just a simple, weak linear trend in the SMA. Importantly, the RP is measured in electrode Cz above the SMA. Hence, we search for it in the SMA (or Noise) Component of our model (and not in Region X). The expected trend in the SMA is the one depicted in red in Fig. 9B for the deliberate easy and hard conditions (here model activity was flipped vertically, from increasing above the x axis to decreasing below it, as in Schurger et al., 2012). In arbitrary decisions, on the other hand, the SMA pair, from which we record, is also the one that determines the outcome. Hence, motion ensues whenever one of the DDMs crosses the threshold. Thus, when its activity is inspected with respect to movement onset, it forms the RP-like shape of Fig. 9B (in blue), in line with the model by Schurger and colleagues (2012). Note that the downward trend for deliberate hard trials is slightly smaller than for deliberate easy (Fig. 9B). While the noise in the empirical EEG signals prohibits reliable statistical differences, the trend in the empirical data is interestingly in the same direction (see the last 500 ms before movement onset in Fig. 3A).

Akin to the Schurger model, we simultaneously fit our DDMs to the complete distribution of our empirical reaction times (RTs; Fig. 9A [...] detailed discussions about the model, its comparison to other models, and the relation to conscious-decision completion).

[...] it should not be determined in any way by external factors (Libet, 1985), which is the case for arbitrary, but not deliberate, decisions (for the latter, each decision alternative is associated with a value, and the values of the alternatives typically guide one's decision). But this notion of freedom faces several obstacles. First, most discussions of free will focus on deliberate decisions, asking when and whether these are free (Frankfurt, 1971; Hobbes, 1994; Wolf, 1990). This might be because everyday decisions with which we associate freedom of will (like choosing a more expensive but more environmentally friendly car, helping a friend instead of studying more for a test, donating to charity, and so on) are generally deliberate, in the sense of being reasoned, purposeful, and bearing consequences (although see Deutschländer, Pauen, and Haynes (2017)). In particular, the free-will debate is often considered in the context of moral responsibility (e.g., was the decision to harm another person free or not) (Fischer, 1999 [...] 1994), and free will is even sometimes defined as the capacity that allows one to be morally responsible (Mele, 2006, 2009). In contrast, it seems meaningless to assign blame or praise to arbitrary decisions. Thus, though the scientific operationalization of free will has typically focused on arbitrary decisions, the common interpretations of these studies, in neuroscience and across the free-will debate, have often alluded to deliberate decisions. This is based on the implicit assumption that the RP studies capture the same, or a sufficiently similar, process as that which occurs in deliberate decisions, and so that inferences from RP results on arbitrary decisions can be made to deliberate decisions.
However, here we show that this assumption may not be justified, as the neural precursors of arbitrary decisions, at least in the form of the RP, do not generalize to meaningful, deliberate decisions (Breitmeyer, 1985; Roskies, 2010 [...] different analyses that we conducted, NHST and Bayesian). But, in contrast, we found no evidence for the existence of an RP in deliberate decisions (in all six analyses) and, at the same time, there was evidence against RP existence in such decisions (in five of the six analyses, with the single, remaining analysis providing only inconclusive evidence for an absence of an RP). Therefore, when the above analyses are taken together, we think that the most plausible interpretation of our results is that the RP is absent in deliberate decisions.

Nevertheless, even if one takes our results to imply that the RP is only strongly diminished in deliberate compared to arbitrary decisions, this provides evidence against drawing strong conclusions regarding the free-will debate from the Libet and follow-up results. The assumption in the Libet-type studies is that the RP simply reflects motor preparation (Haggard, 2019; Haggard & Eimer, 1999; Libet, 1985; Libet et al., 1983; Shibasaki & Hallett, 2006) and in that it lives up to its name. However, in our paradigm, both the sensory inputs and the motor outputs were the same between arbitrary and deliberate trials. Thus, motor preparation is expected in both conditions and the RP should have been found in both. Accordingly, any consistent difference in the RP between the decision types suggests, at the very least, that it is a more complex signal than Libet and colleagues had assumed. For one, it shows that it is influenced by cognitive state and that it cannot be regarded as a genuine index of a voluntary decision, be it arbitrary or deliberate.
Further, our model predicted an RP in arbitrary decisions but only a slow trend in movement-locked ERP during deliberate decisions that is in the same direction as the RP, but is not an RP. Hence, a signal that resembles a strongly diminished RP but is in fact just a slow trend in the same direction is congruent with our model.

Interestingly, while the RP was present in arbitrary decisions but absent in deliberate ones, the LRP (a long-standing, more-motor ERP component, which began much later than the RP) was indistinguishable between the different decision types. This provides evidence that, at the motor level, the neural representation of the deliberate and arbitrary decisions that our subjects made may have been indistinguishable, as was our intention when designing the task. [...] that a common mechanism may underlie both decision types). Possibly then, arbitrary and deliberate decisions may differ not only with respect to the RP, but be subserved by different underlying neural circuits, which makes generalization from one class of decisions to the other more difficult. Deliberate decisions are associated with more lateralized and central neural activity while arbitrary ones are associated with more medial and frontal activity. This appears to align with the different brain regions associated with the two decision types above, as also evidenced by the differences we found between the scalp distributions of arbitrary and deliberate decisions (Fig. 3A). Further studies are needed to explore this potential divergence in the neural regions between the two decision types.

Therefore, at the very least, our results support the claim that the previous findings regarding the RP should be confined to arbitrary decisions and do not generalize to deliberate ones. What is more, if the ubiquitous RP does not generalize, it cannot simply be assumed that other markers will.
Hence, such differences clearly challenge the generalizability of previous studies focusing on arbitrary decisions to deliberate ones, regardless of whether they were based on the RP or not. In other words, our results put the onus on attempts to generalize markers of upcoming action from arbitrary to deliberate decisions; it is on them now to demonstrate that those markers do indeed generalize. And, given the extent of the claims made and conclusions derived based on the RP in the neuroscience of free will (see again Mele, 2015; Pockett, Banks, & Gallagher, 2009; Sinnott-Armstrong & Nadel, 2011), our findings call for a re-examination of some of the basic tenets of the field.

It should be noted that our study does not provide positive evidence that consciousness is more involved in deliberate decisions than in arbitrary ones; such a strong claim requires further evidence, perhaps from future research. But our results highlight the need for such research. Under some (strong) assumptions, the onset of the RP before the onset of reported intentions to move may point to there being no role for consciousness in arbitrary decisions. But, even if such conclusions can be reached, they cannot be safely extended to deliberate decisions.

To be clear, and following the above, we do not claim that the RP captures all unconscious processes that precede conscious awareness. However, some have suggested that the RP represents unconscious motor-preparatory activity before any kind of decision (e.g., Libet, 1985). But our results provide evidence against that claim, as we do not find an RP before deliberate decisions, which also entail motor preparation.
What is more, in deliberate decisions in particular, it is likely that there are neural precursors of upcoming actions, possibly involving the above neural circuits as well as circuits that represent values, which are unrelated to the RP (the lack of such precursors is not merely implausible; it implies dualism: Mudrik & Maoz, 2014; Wood, 1985).

Note also that we did not attempt to clock subjects' conscious decision to move. Rather, we instructed them to hold their hands above the relevant keyboard keys and press their selected key as soon as they made up their mind. This was to keep the decisions in this task more ecological and because we think that the key method of measuring decision completion (using some type of clock to measure Libet's W-time) is highly problematic (see Methods). But, even more importantly, clock monitoring was demonstrated to have an effect on RP size (Miller et al., 2011), so it could potentially confound our results.

Some might also claim that unconscious decision-making could explain our results, suggesting that in arbitrary decisions subjects engage in unconscious deliberation or in actively inhibiting their urge to follow their preference as well as in free choice, while in deliberate decisions only deliberation is required. But this interpretation is unlikely because the longer RTs in deliberate decisions suggest, if anything, that more complex mental processes (conscious or unconscious) took place before deliberate and not arbitrary decisions. In addition, these interpretations should impede our chances of finding the RP in arbitrary trials (as the design diverges from the original Libet task), yet the RP was present, rendering them less plausible.

Aside from highlighting the neural differences between arbitrary and deliberate decisions, this study also challenges a common interpretation of the function of the RP.
If the RP is not present before deliberate action, it does not seem to be a necessary link in the general causal chain leading to action. Schurger et al. (2012) suggested that the RP reflects the accumulation of autocorrelated, stochastic fluctuations in neural activity that lead to action, following a threshold crossing, when humans arbitrarily decide to move. According to that model, the shape of the RP results from the manner in which it is computed from autocorrelated EEG: averaged over trials that are locked to response onset (that directly follows the threshold crossing). Our results and our model are in line with that interpretation and expand it to decisions that include both when and which hand to move. They suggest that the RP represents the accumulation of noisy, random fluctuations that drive arbitrary decisions, whereas deliberate decisions are mainly driven by the values associated with the decision alternatives (Maoz et al., 2013 [...] to find a way to break the symmetry between the two possible actions. If so, the RP in the arbitrary decisions might actually reflect the extra effort in those types of decisions, which is not found in deliberate decisions. However, this interpretation entails a longer reaction time for arbitrary than for deliberate decisions, because of the heavier cognitive load, which is the opposite of what we found (Fig. 2A).

In conclusion, our study suggests that RPs do not precede deliberate decisions (or at the very least are strongly diminished before such decisions). In addition, it suggests that RPs represent an artificial accumulation of random fluctuations rather than serving as a genuine marker of an unconscious decision to initiate voluntary movement.
Hence, our results challenge RP-based claims of Libet and follow-up literature against free will in arbitrary decisions and much more so the generalization of these claims to deliberate decisions. The neural differences we found between arbitrary and deliberate decisions as well as our model further put the onus on any study trying to draw conclusions about the free-will debate from arbitrary decisions to demonstrate that these conclusions generalize to deliberate ones. This motivates future investigations into other precursors of action besides the RP using EEG, fMRI, or other techniques. It also highlights that it would be of particular interest to find the neural activity that precedes deliberate decisions as well as neural activity, which is not motor activity, that is common to both deliberate and arbitrary decisions.

Materials and Methods
Subjects
Twenty healthy subjects participated in the study. They were California Institute of Technology (Caltech) students as well as members of the Pasadena community. All subjects reported normal or corrected-to-normal sight and no psychiatric or neurological history. They volunteered to participate in the study for payment ($20 per hour). Subjects were prescreened to include only participants who were socially involved and active in the community (based on the strength of their support of social causes, past volunteer work, past donations to social causes, and tendency to vote). The data from 18 subjects were analyzed; two subjects were excluded from our analysis (see Sample size and exclusion criteria below). The experiment was approved by Caltech's Institutional Review Board (14-0432; Neural markers of deliberate and random decisions), and informed consent was obtained from all participants after the experimental procedures were explained to them.

Sample size and exclusion criteria
We ran a power analysis based on the findings of Haggard and Eimer (1999). Their RP in a free left/right-choice task had a mean of 5.293 µV and standard deviation of 2.267 µV. Data from a pilot study we ran before this experiment suggested that we might obtain smaller RP values in our task (they referenced to the tip of the nose and we to the average of all channels, which typically results in a smaller RP). Therefore, we conservatively estimated the magnitude of our RP as half of that of Haggard & Eimer, 2.647 µV, while keeping the standard deviation the same at 2.267 µV. Our power analysis therefore suggested that we would need at least 16 subjects to reliably find a difference between an RP and a null RP (0 µV) at a significance level of 0.05 and power of 0.99. This number agreed with our pilot study, where we found that a sample size of at least 16 subjects resulted in a clear, averaged RP. Following the above reasoning, we decided beforehand to collect data from 20 subjects for this study, taking into account that some could be excluded as they would not meet the following predefined inclusion criteria: at least 30 trials per experimental condition remaining after artifact rejection; and averaged RTs (across conditions) that deviated by less than 3 standard deviations from the group mean.

Subjects were informed about the overall number of subjects that would participate in the experiment when the NPO lottery was explained to them (see below). So, we had to finalize the overall number of subjects who would participate in the study, but not necessarily the overall number of subjects whose data would be part of the analysis, before the experiment began. After completing data collection, we ran only the EEG preprocessing and behavioral-data analysis to test each subject against the exclusion criteria. This was done before we looked at the data with respect to our hypothesis or research question.
Two subjects did not meet the inclusion criteria: the data of one subject (#18) suffered from poor signal quality, resulting in fewer than 30 trials remaining after artifact rejection; another subject (#12) had RTs longer than 3 standard deviations from the mean. All analyses were thus run on the 18 remaining subjects.
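
The sample-size reasoning above can be illustrated with a short calculation. This is a minimal sketch and not the authors' procedure: it assumes a two-sided, one-sample t-test against 0 µV and uses a common normal-approximation formula with a small-sample correction term (z²/2) instead of an exact noncentral-t computation. The function name `approx_one_sample_n` is hypothetical; only the mean, standard deviation, significance level, and power come from the text.

```python
import math
from statistics import NormalDist


def approx_one_sample_n(mean_diff, sd, alpha=0.05, power=0.99):
    """Approximate sample size for a two-sided one-sample t-test against zero.

    Assumption: normal approximation plus a z^2/2 small-sample correction;
    the study's actual power-analysis method is not stated in the text.
    """
    effect_size = mean_diff / sd                      # standardized effect (Cohen's d)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)     # two-sided critical value
    z_power = NormalDist().inv_cdf(power)             # quantile for the target power
    n_normal = ((z_alpha + z_power) / effect_size) ** 2
    n_corrected = n_normal + z_alpha ** 2 / 2         # correction for using t, not z
    return math.ceil(n_corrected)


# Values from the text: halved RP magnitude of 2.647 µV, SD of 2.267 µV.
n_required = approx_one_sample_n(2.647, 2.267)
print(n_required)  # prints 16
```

Under these assumptions the approximation gives the same minimum of 16 subjects reported above; a one-sided test or a different approximation would give a different number.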

Stimuli and apparatus
Subjects sat in a dimly lit room. The stimuli were presented on a 21" Viewsonic G225f (20" viewable) CRT monitor with a 60-Hz refresh rate and a 1024×768 resolution using Psychtoolbox version 3 and Mathworks Matlab 2014b (Brainard, 1997; Pelli, 1997 [...] descriptions, representing two actual NPOs, were presented in each trial (Fig. 1). In deliberate blocks, subjects were instructed to choose the NPO to which they would like to donate $1000 by pressing the <Q> or <P> key on the keyboard, using their left and right index finger, for the NPO on the left or right, respectively, as soon as they decided. Subjects were informed that at the end of each block one of the NPOs they chose would be randomly selected to advance to a lottery. Then, at the end of the experiment, the lottery would take place and the winning NPO would receive a $20 donation. In addition, that NPO would advance to the final, inter-subject lottery, where one subject's NPO would be picked randomly for a $1000 donation. It was stressed that the donations were real and that no deception was used in the experiment. To persuade the subjects that the donations were real, we presented a signed commitment to donate the money, and promised to send them the donation receipts after the experiment. Thus, subjects knew that in deliberate trials, every choice they made was not hypothetical, and could potentially lead to an actual $1020 donation to their chosen NPO.

Arbitrary trials were identical to deliberate trials except for the following crucial differences. Subjects were told that, at the end of each block, the pair of NPOs in one randomly selected trial would advance to the lottery together. And, if that pair won the lottery, both NPOs would receive $10 (each). Further, the NPO pair that would win the inter-subject lottery would receive a $500 donation each.
Hence it was stressed to the subjects that there was no reason for them to prefer one NPO over the other in arbitrary blocks, as both NPOs would receive the same donation regardless of their button press. Subjects were told to therefore simply press either <Q> or <P> as soon as they decided to do so.

Thus, while subjects' decisions in the deliberate blocks were meaningful and consequential, their decisions in the arbitrary blocks had no impact on the final donations that were made. In these trials, subjects were further urged not to let their preferred NPO dictate their response. Importantly, despite the difference in decision type between deliberate and arbitrary blocks, the instructions for carrying out the decisions were identical: Subjects were instructed to report their decisions as soon as they made them in both conditions. They were further asked to place their left and right index fingers on the response keys, so they could respond as quickly as possible. Note that we did not ask subjects to report their "W-time" (time of consciously reaching a decision), because this measure was shown to rely on neural processes occurring after response onset (Lau, Rogers, & Passingham, 2007) and to potentially be backward inferred from movement time (Banks & Isham, 2009 [...] choose the cause to which you want to donate $1000") or in blue (Arbitrary: "In this block both causes may each get a $500 donation regardless of the choice") on a gray background that was used throughout the experiment. Short-hand instructions appeared at the top of the screen throughout the block in the same colors as that block's initial instructions; Deliberate: "Choose for $1000" or Arbitrary: "Press for $500 each" (Fig. 1).

Each trial started with the gray screen that was blank except for a centered, black fixation cross. The fixation screen was on for a duration drawn from a uniform distribution between 1 and 1.5 s.
Then, the two causes appeared on the left and right side of the fixation cross (left/right assignments were randomly counterbalanced) and remained on the screen until the subjects reported their decisions with a key press: <Q> or <P> on the keyboard for the cause on the left or right, respectively. The cause corresponding to the pressed button then turned white for 1 s, and a new trial started immediately. If subjects did not respond within 20 s, they received an error message and were informed that, if this trial were selected for the lottery, no NPO would receive a donation. However, this did not happen for any subject on any trial.

To assess the consistency of subjects' decisions during the main experiment with their ratings in the first part of the experiment, subjects' choices were coded in the following way: each binary choice in the main experiment was given a consistency grade of 1 if subjects chose the NPO that was rated higher in the rating session, and 0 if not. Then an averaged consistency grade for each subject was calculated as the mean consistency grade over all the choices. Thus, a consistency grade of 1 indicates perfect consistency with one's ratings across all trials, 0 is perfect inconsistency, and 0.5 is chance performance.

We wanted to make sure subjects were carefully reading and remembering the causes also during the arbitrary trials. This was part of an effort to better equate (as much as possible, given the inherent difference between the conditions) memory load, attention, and other cognitive aspects between deliberate and arbitrary decisions, except those aspects directly associated with the decision type, which was the focus of our investigation. We therefore randomly interspersed 36 memory catch-trials throughout the experiment (thus more than one catch trial could occur per block).
On such trials, four succinct descriptions of causes were presented, and subjects had to select the one that appeared in the previous trial. A correct or incorrect response added or subtracted 50 cents from their total, respectively. (Subjects were informed that, if they reached a negative balance, no money would be deducted from their payment for participation in the experiment.) Thus, subjects could earn $18 more for the experiment, if they answered all memory test questions correctly. Subjects typically did well on these memory questions, on average erring in 2. [...]

The EEG was recorded using an Active 2 system (BioSemi, the Netherlands) from 64 [...]

Due to anatomical differences between subjects, variation in the positioning of the electrode cap, and the fact that our EEG caps came in three discrete sizes, it is unlikely that any given electrode will be optimally placed to record the RP in all subjects. Most subjects exhibited an RP at electrode Cz and one or more adjacent electrodes, especially contralateral to the dominant hand (used to perform the task), but the center of the spatial distribution varied from subject to subject. Therefore, for each subject we selected the electrode from Cz, C1, or FC1 (Cz, C2, or FC2 if left-handed) that showed the highest-amplitude RP in the data from the classic task. This same electrode was then used for analysis of the data from the interruptus task (so the choice of electrode used in Fig. 3 was independent of the data presented in Fig. 3). Limiting the choice to C1 (C2) or FC1 (FC2) did not change the outcome.
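
The per-subject electrode selection described above can be sketched as follows. This is a hedged illustration only: the text does not say how "highest-amplitude RP" was quantified, so the sketch assumes it means the most negative mean voltage in a pre-movement window (here the last 0.5 s before movement onset, an arbitrary choice). The function names, the window, and the toy waveforms are all hypothetical.

```python
from statistics import mean


def rp_amplitude(times_s, voltages_uv, window=(-0.5, 0.0)):
    """Mean voltage (µV) of a movement-locked average inside the window.

    Assumption: the RP is negative-going, so a more negative mean is
    treated as a larger RP.
    """
    in_window = [v for t, v in zip(times_s, voltages_uv)
                 if window[0] <= t <= window[1]]
    return mean(in_window)


def select_rp_electrode(times_s, averages_by_electrode, left_handed=False):
    """Pick the candidate electrode with the largest (most negative) RP."""
    # Candidate sets follow the text: Cz, C1, FC1 (Cz, C2, FC2 if left-handed).
    candidates = ("Cz", "C2", "FC2") if left_handed else ("Cz", "C1", "FC1")
    available = [name for name in candidates if name in averages_by_electrode]
    return min(available,
               key=lambda name: rp_amplitude(times_s, averages_by_electrode[name]))


# Toy movement-locked averages (time 0 = movement onset); values are invented.
times = [-1.0, -0.75, -0.5, -0.25, 0.0]
averages = {
    "Cz": [0.0, -1.0, -2.0, -3.0, -4.0],
    "C1": [0.0, -1.0, -3.0, -5.0, -6.0],
    "FC1": [0.0, 0.0, -1.0, -1.0, -2.0],
}
chosen = select_rp_electrode(times, averages)
print(chosen)  # prints C1
```

Because the electrode is chosen from one dataset and then reused on another, as the text describes, the selection step does not bias the waveform that is later analyzed.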
Model and Simulations. All simulations were performed using MatLab (MathWorks). The model includes two components: a leaky stochastic accumulator (with a threshold on its output) and a time-locking/epoching procedure. We used a well-known accumulator model (DDM) (27), which is an extension of an earlier model developed by Ratcliff (23). Simulation of the model amounts to iterative numerical integration of the differential equation

$$\delta x_i = (I - k x_i)\,\Delta t + c\,\xi_i \sqrt{\Delta t},$$

where I is drift rate, k is leak (exponential decay in x), ξ is Gaussian noise, and c is a noise-scaling factor (we used c = 0.1). Δt is the discrete time step used in the simulation (we used Δt = 0.001). In the context of our model, I corresponds to a general (and we assume constant) urgency to respond that is inherent in the demand characteristics of the task. A small amount of urgency is necessary in the model to account for the fact that subjects rarely if ever wait longer than ∼20 s to produce a movement in any given trial. Because of the leak term, the urgency does not set up a linear trajectory toward the threshold (i.e., if we were to increase the threshold that we used by a factor of 2, the output of the accumulator would essentially never reach it), but simply moves the baseline level of activity closer to the threshold so that a crossing is very likely to happen soon (Fig. 1, Inset). Thus, the model has three free parameters: urgency (I), leak (k), and threshold (β). The threshold was expressed as a percentile of the output amplitude over a set of 1,000 simulated trials (50,000 time steps each). These three parameters were chosen on the basis of the best fit of the first crossing-time distribution to the empirical waiting-time distribution from the classic Libet task (we use the term "waiting time" instead of "reaction time"). The parameters were then fixed at these values for all other simulations and analyses, including the fitting of the RP. The three parameter values assigned