How much does emotional valence of action outcomes affect temporal binding?

Temporal binding refers to the compression of the perceived time interval between voluntary actions and their sensory consequences. Research suggests that the emotional content of an action outcome can modulate the effects of temporal binding. We attempted to conceptually replicate these findings using a time interval estimation task and different emotionally-valenced action outcomes (Experiments 1 and 2) than used in previous research. Contrary to previous findings, we found no evidence that temporal binding was affected by the emotional valence of action outcomes. After validating our stimuli for equivalence of perceived emotional valence and arousal (Experiment 3), in Experiment 4 we directly replicated Yoshie and Haggard's (2013) original experiment using sound vocalizations as action outcomes and failed to detect a significant effect of emotion on temporal binding. These studies suggest that the emotional valence of action outcomes exerts little influence on temporal binding. The potential implications of these findings are discussed.


Introduction
Temporal binding refers to the compression of the perceived time interval between voluntary actions and their sensory consequences (Haggard, Clark, & Kalogeras, 2002). More specifically, an outcome (e.g., a tone) is experienced earlier when it is triggered by a voluntary action compared to when it occurs in isolation or is triggered by an involuntary movement. Similarly, actions that trigger an event are experienced later than actions with no discernible outcome (see Moore & Obhi, 2012, for a review). For example, Haggard et al. (2002) examined judgements of the onset time of both a voluntary action and a resulting tone using the Libet clock method (Libet, Gleason, Wright, & Pearl, 1983), where one estimates the time of onset of an action or outcome via the position of a rotating clock-hand around a clock-face. These judgements were compared to those made when only the action was performed (i.e., with no outcome) and when a sound was heard in isolation (i.e., without a prior cause). Haggard et al. found that the perceived time of an action was later when the action produced a tone compared to when there was no outcome. Moreover, the perceived time of a sound was earlier when the sound had been produced by an action compared to when it was heard in isolation. In other words, temporal binding means that the time interval between an action and its outcome becomes perceptually compressed when we think there is a causal relationship between action and outcome. Temporal binding has also been observed with methods other than the Libet task, such as verbal or numerical estimates of the interval between action and outcome (Buehner & Humphreys, 2009; Humphreys & Buehner, 2010). Temporal binding has been shown to occur for both self- and other-generated actions (Moore, Teufel, Subramaniam, Davis, & Fletcher, 2013; Poonian & Cunnington, 2013) and may be a general phenomenon linking causally related events (Buehner, 2012).
To date, researchers have mostly investigated the conditions required for temporal binding and the mechanisms that underpin it (Hughes, Desantis, & Waszak, 2013), and they have done so using experimental tasks that often involve basic actions, such as a button press, producing sensory feedback, such as an auditory tone (David, Newen, & Vogeley, 2008; Sato & Yasuda, 2005). These temporal binding tasks arguably lack the real-world complexity with which humans perform goal-directed actions to produce meaningful outcomes in everyday life (Moretto, Walsh, & Haggard, 2011). Researchers have started to examine the generalizability of temporal binding effects to stimuli beyond simple and arbitrary outcomes, such as priming social cues (Aarts et al., 2012), authorship of action cues (Desantis, Weiss, Schütz-Bosbach, & Waszak, 2012), leader-follower cues (Pfister, Obhi, Rieger, & Wenke, 2015) and economic and pain cues (Caspar, Christensen, Cleeremans, & Haggard, 2016). For example, Aarts et al. (2012) found that, when primed with a positive picture (taken from the International Affective Picture System; Lang, Bradley, & Cuthbert, 1999) that indicated a reward, temporal binding during the Libet clock task increased compared to neutral primes. Takahata et al. (2012) trained participants to associate two tones with either financial gain or loss. Using the Libet task, they found that the temporal interval between judgements of onsets for actions and outcomes of financial loss was significantly larger than for judgements of financial gain. In other words, negative outcomes reduced the effect of temporal binding. This points towards the possibility that the effect of valence on temporal binding might be driven by self-serving biases, where one is more inclined to associate positive events with the self compared to negative events (Mezulis, Abramson, Hyde, & Hankin, 2004; Miller & Ross, 1975).
Yoshie and Haggard (2013) directly tested this idea by investigating whether temporal binding differed between outcomes that varied in terms of their intrinsic emotionality. They asked participants to make voluntary actions (a keypress) that produced sounds that were either positive or negative emotional vocalizations (e.g., laughter or disgust). Participants made temporal estimations of their actions and the ensuing sound via the Libet clock method. They found that positive sounds produced shorter estimations of onset-time between the action and sound compared to negative sounds (Experiment 1), with this effect being mostly driven by decreased binding to negative outcomes (Experiment 2). Yoshie and Haggard's (2013) research provided promising evidence that negative emotional outcomes reduce temporal binding, which occurs presumably because people are less inclined to attribute negative outcomes to themselves. However, despite the potential importance of Yoshie and Haggard's (2013) findings, they have yet to be replicated using other temporal binding tasks and different emotionally-valenced action outcomes. Thus, answering Christensen, Yoshie, Di Costa, and Haggard's (2016) call for more research exploring the emotional modulation of temporal binding using alternative methods, the goal of the current research was to conceptually replicate Yoshie and Haggard's (2013) temporal binding effects using an interval estimation procedure (vs. the Libet task; Moore & Obhi, 2012) and images of faces conveying positive and negative emotions (vs. emotional vocalizations; Experiments 1 and 2). Moreover, we conducted a separate study to validate the perceived valence of the face stimuli we used in Experiments 1 and 2 (Experiment 3), and we conducted a highly-powered direct replication of Yoshie and Haggard's first experiment (Experiment 4).
On the basis of Yoshie and Haggard's findings, we expected that temporal binding would be smaller for negative outcomes (faces or vocalizations conveying negative emotions) than for positive outcomes (faces or vocalizations conveying positive emotions).

Experiment 1
We used an interval estimation procedure to gauge temporal binding (Ebert & Wegner, 2010; Engbert, Wohlschläger, & Haggard, 2008; Moore, Wegner, & Haggard, 2009). In this procedure, participants are asked to judge the time interval between an action and its sensory outcome (e.g., a button press and a sound). Using this procedure, Engbert et al. (2008) found that the intervals between voluntary actions and visual, auditory, and somatic outcomes were compressed compared to the intervals between passive actions and similar outcomes. For our task, participants were asked to press the space bar, which was followed by emotionally valenced action outcomes, namely emoticons depicting positive, neutral, or negative emotions (see Fig. 1). Emoticons are prevalent throughout modern technological communication, and frequently used to convey emotion (Derks, Bos, & Von Grumbkow, 2008; Hudson et al., 2015). Research has shown that emoticons elicit similar cortical responses to real faces (Churches, Nicholls, Thiessen, Kohler, & Keage, 2014) and that emotions conveyed in emoticons are subject to similar behavioural biases (Öhman, Lundqvist, & Esteves, 2001) and neural processing disruptions (Jolij & Lamme, 2005) as real faces.

Participants
We recruited 80 native English-speaking participants (51 males, M age = 33.91, SD age = 11.27) through prolific.ac, an online crowdsourcing platform. Participants received monetary compensation. We screened participants for the following inclusion criteria: an approval rating of above 90% on prolific.ac (based on prior experiment performance/approval scores) and aged between 18 and 65. The required sample size was fixed ahead of data collection, and a power analysis showed we had 90% power to detect a small effect (Cohen's f = 0.10) of emotional valence on temporal binding (α = 0.05).

Materials and procedures
Experiment 1 consisted of 100 trials: 10 practice and 90 experimental trials. We used an interval estimation procedure to measure temporal binding (see Moore & Obhi, 2012). For each trial, participants saw a fixation cross on the screen, and in their own time, pressed the spacebar. In the practice block participant actions produced a neutral stimulus, which was a green circle with a diameter equal to the emoticon images. During practice trials, the green circle appeared after a randomly selected time interval from either 0 ms or a multiple of 100 ms up to 900 ms. We used all intervals in the practice block, to encourage participants to expect the full range of durations in the experimental block. During the practice block, feedback was provided to participants after they made their time estimations. Feedback consisted of both the participant's estimated time and the actual time of stimulus onset to enhance familiarity with estimating time in milliseconds. In the experimental condition, an emoticon appeared after either 100, 400 or 700 ms (Moore et al., 2009), which remained on the screen for a further 400 ms. We varied the delay intervals to increase participants' uncertainty regarding the interval between action and outcome to allow for variation in judgement times (cf. Ebert & Wegner, 2010). The emotional expressions of the emoticons were manipulated by orienting the lines representing the mouth: curved upwards for positive, curved downwards for negative, and a straight line for neutral. The emoticons were genderless, varied only in the shape of the mouth, and were presented on a white background in the center of the screen (see Fig. 1).
Participants underwent two blocks of 45 trials, allowing for 30 presentations of each emoticon image in total. Participants were instructed that they would not receive feedback for their time estimations during the experimental trials. A schematic display of the sequence of trial events is shown in Fig. 2.
Both the time intervals and emoticons (either positive, negative or neutral) were pseudo-randomised across trials, such that there was the same number of trials in each condition at each time interval. A blank screen then followed the emoticon for 400 ms, replaced by a horizontal time estimation scale in the center of the screen (see Fig. 3). The scale ranged from 0-1000 ms, with demarcation lines every 100 ms. Participants were instructed to scroll the slider along the bar to the time that they believed it took the image to appear since their action (in multiples of 100 ms). Once selected, participants confirmed their selections by clicking on a 'finish' button, and proceeded to the next trial.
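The balancing constraint described above (each emoticon type paired equally often with each delay) can be sketched in a few lines. This is only an illustration under stated assumptions, not the experiment code that was actually used: the function name `build_trials`, the seed, and the choice to balance within each block of 45 trials are hypothetical; the text states only that conditions were balanced across trials.

```python
import random
from collections import Counter
from itertools import product

VALENCES = ("positive", "neutral", "negative")  # emoticon types
DELAYS_MS = (100, 400, 700)                     # action-outcome intervals
N_BLOCKS = 2
TRIALS_PER_BLOCK = 45


def build_trials(seed=None):
    """Return a list of blocks; each block is a shuffled list of
    (valence, delay_ms) trials with every cell equally represented.
    Balancing per block is an assumption made for this sketch."""
    rng = random.Random(seed)
    cells = list(product(VALENCES, DELAYS_MS))        # 9 valence x delay cells
    reps_per_block = TRIALS_PER_BLOCK // len(cells)   # 5 repetitions per cell
    blocks = []
    for _ in range(N_BLOCKS):
        block = cells * reps_per_block
        rng.shuffle(block)
        blocks.append(block)
    return blocks


blocks = build_trials(seed=1)
all_trials = [trial for block in blocks for trial in block]
cell_counts = Counter(all_trials)
valence_counts = Counter(valence for valence, _ in all_trials)
```

With these settings the sketch yields 90 experimental trials, 10 per valence-by-delay cell and 30 per emoticon, matching the totals given in the text.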

Discussion
In Experiment 1, the emotional valence of action outcomes did not affect temporal binding. One potential limitation of Experiment 1 is that although previous research has shown that emoticons can have the same affective consequences as real faces do (Öhman et al., 2001), the emoticons we used might not have elicited enough of an emotional response to modulate temporal binding. Thus, rather than using emoticons for action outcomes, in Experiment 2 we replicated our Experiment 1 procedure using images of real human faces expressing either negative or positive emotions.

Experiment 2
In Experiment 2, we used real-face images as the outcomes to participants' actions. Real face images have been well documented to elicit electrocortical responses, and emotional expressions are typically rated along the dimensions of valence and arousal. Smith, Weinberg, Moran, and Hajcak (2013), using the NimStim collection of face-images (NimStim, Tottenham et al., 2009), found that emotional expressions (e.g., happy, fearful, sad) elicited greater cortical responses than neutral face images. Generally, both negative and positive emotions invoke stronger emotional responses than faces with neutral expressions (Ito, Cacioppo, & Lang, 1998); however, the current literature suggests negative emotions elicit stronger cortical responses than positive emotions (Leppänen, Kauppinen, Peltola, & Hietanen, 2007; Smith, Cacioppo, Larsen, & Chartrand, 2003).

Participants
We recruited 89 participants (55 males, M age = 33.73, SD age = 10.74) through prolific.ac.uk. An additional participant was excluded due to a technical problem. Participants received monetary compensation. A power analysis showed that we had 95% power to detect a small effect (Cohen's f = 0.10) of emotional valence on temporal binding (α = 0.05).

Materials and procedures
Experiment 2 consisted of 110 trials: 30 practice trials, and 80 experimental trials. To prepare participants for the experimental procedure, we asked participants to initially perform a practice task consisting of 10 trials where their actions produced a neutral stimulus (the green circle). Similar to Experiment 1, during practice trials the time interval for the stimuli to appear was randomly selected from either 0 ms, or a multiple of 100 ms, up to 900 ms. Participants were provided with feedback about the accuracy of their time estimations.
Outcome stimuli consisted of 80 face images of young adults either portraying positive or negative expressions, taken from a widely used and validated set of face stimuli (NimStim, Tottenham et al., 2009). The facial images were balanced for gender, such that 10 males and 10 females were randomly chosen from the set (see Fig. 5). Four facial images per male/female were chosen: two depicting positive facial emotions, and two depicting negative facial emotions (80 images in total, 4 × 20). The positive facial emotions comprised 40 images of a happy expression, and the negative facial emotions comprised 36 images of disgust and 4 images of fear expressions. Images were presented on a white background in the center of the screen. For the initial practice trials, we used the same neutral stimulus (green circle) as Experiment 1. Participants underwent two experimental task blocks of 40 trials each, with a break between blocks. Each block was dedicated to either solely positive expressions or negative expressions, and the order of task blocks was counterbalanced between participants. Therefore, action-effects were predictable within their own blocks. Furthermore, participants were instructed that they would not receive feedback for their time estimations. The time interval for face images to appear was randomised at 100 ms, 400 ms, or 700 ms (Moore et al., 2009), with the same number of trials in each condition at each time interval. A practice block of 10 trials that contained stimuli of the related task block preceded each experimental block. Upon block completion, participants were instructed that they would be asked to complete another practice task where they would see a different set of images, receiving feedback with their time estimations.
To incentivize participants to attend to the face stimuli, we also implemented catch trials by informing participants that they would occasionally be asked a question about the image they had just seen (specifically, "Was the previous face male or female?"). They were awarded an extra 10 pence for each correct answer. There were six catch trials in total, with three trials per experimental condition. Seventy-six participants (84%) scored correctly on all catch trials, 8 participants (9%) scored correctly on 5 catch trials, and the remaining 6 participants scored correctly on 4 catch trials.

Discussion
Similar to Experiment 1, the findings from our second experiment indicated no modulation of negative versus positive emotions on temporal binding. This is despite the use of real facial images depicting emotional expressions (as opposed to emoticons), and the predictability of which emotion-expression (either positive or negative) would result from the participant's action.

Experiment 3
For both Experiments 1 and 2, we failed to find any meaningful effect of emotion on temporal binding, which seems inconsistent with earlier findings. One potential issue with our first two experiments, however, is that the stimuli we used for the positive and negative action outcomes (emoticons and real faces) might be perceived as less positively and/or negatively valenced than the sound vocalizations that Yoshie and Haggard (2013) used and therefore produced weaker temporal binding effects. To validate our stimuli, in Experiment 3 participants rated the emotional valence and arousal of the emoticons and faces we used in Experiments 1 and 2 and the positive and negative sound vocalizations that Yoshie and Haggard used.

Participants
Forty-nine participants were recruited via Amazon's Mechanical Turk (25 males, M age = 34.80, SD age = 11.56). To ensure data independence, one additional participant was not included in the analyses because they had a duplicate IP address.

Materials and procedures
Participants were informed that they would rate several images of faces and sound vocalizations in terms of how negative-to-positive and emotionally arousing they appeared or sounded, respectively. Participants first performed a sound check that asked them to identify three different sounds (e.g., a cow mooing) from three choices (e.g., a pig's oink, a cow's moo, or a chicken's cluck) in order to ensure participants both could hear the sounds properly and were paying attention. All respondents saw the emoticons used in Experiment 1, all 80 facial expressions used in Experiment 2, and heard 24 sounds (three repetitions of the 8 different sounds). The sounds were the same as those used by Yoshie and Haggard (2013), which were a selection of 8 different non-verbal emotional vocalizations: four negative vocalizations (screams expressing fear or retches expressing disgust, each with both male and female voices) and four positive vocalizations (cheers expressing achievement or laughs expressing amusement, each with both male and female voices). The block order of which type of stimulus the participants rated was randomly determined, and the stimuli presented within those blocks were randomised. Using the same rating scales as Yoshie and Haggard, after seeing/hearing the stimulus, participants judged the extent to which each stimulus looked (for the images) or sounded (for the vocalizations) negative-to-positive, on a 7-point scale ranging from 1 (highly negative) to 7 (highly positive). Participants also rated the extent to which they believed each stimulus sounded or looked emotionally arousing (1 = not arousing at all to 7 = highly arousing).

Results
Ratings of valence and emotional arousal were averaged across the different positive and negative faces and sounds. Because we were primarily interested in determining whether the different stimuli were perceived to be of equivalent valence, we conducted a one-way ANOVA with stimulus type on three levels (emoticons, faces, and vocalizations) separately for positive and negative stimuli. As shown in Table 1, there was a significant main effect of stimulus type in terms of perceived valence for both positive stimuli, F(2, 96) = 15.21, p < 0.001, ηp² = 0.24, and negative stimuli, F(2, 96) = 22.44, p < 0.001, ηp² = 0.32. Paired-samples t-tests revealed that the happy emoticon was rated as significantly more positive than the positive vocalizations, t(48) = 3.52, p = 0.001; there was no significant mean difference between the positive faces and positive vocalizations in terms of perceived valence, t(48) = 1.95, p = 0.057. For the negative stimuli, the negative vocalizations were rated as more positive (less negative) than both the sad emoticon, t(48) = 5.76, p < 0.001, and the negative faces, t(48) = 2.70, p = 0.01. Thus, the emoticon and face stimuli we used in Experiments 1 and 2 were perceived as either the same or more emotionally-valenced than the sound vocalizations used by Yoshie and Haggard (2013).
We also conducted a one-way ANOVA with stimulus type on three levels (emoticons, faces, and vocalizations) separately for positive and negative stimuli for perceived emotional arousal. There were no significant differences among the types of positive stimuli for the ratings of emotional arousal, F(2, 96) = 1.44, p = 0.24, ηp² = 0.03. For the negative stimuli, F(2, 96) = 3.70, p = 0.028, ηp² = 0.07, the negative vocalizations were rated as more arousing than the sad emoticon, t(48) = 2.31, p = 0.025, but were no more arousing than the negative faces, t(48) = 0.79, p = 0.43.
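The effect sizes reported alongside each F statistic (partial eta squared) can be recovered from the F value and its degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies that identity to the four ANOVAs reported above as a consistency check; the function name is illustrative, and nothing here is taken from the original analysis scripts.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values as reported in the text, each with df = (2, 96).
reported_f = {
    "valence, positive stimuli": 15.21,
    "valence, negative stimuli": 22.44,
    "arousal, positive stimuli": 1.44,
    "arousal, negative stimuli": 3.70,
}

effect_sizes = {
    label: round(partial_eta_squared(f, 2, 96), 2)
    for label, f in reported_f.items()
}

for label, eta in effect_sizes.items():
    print(f"{label}: partial eta squared = {eta:.2f}")
```

The recomputed values (0.24, 0.32, 0.03, and 0.07) agree with the effect sizes given in the text.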

Discussion
The findings of Experiment 3 indicate that the visual stimuli used within Experiments 1 and 2 and the audio stimuli of Yoshie and Haggard (2013) were by and large rated similarly across dimensions of perceived valence and emotional arousal. More specifically, the positive emoticon was rated as more positive and more emotionally arousing than those of real faces and emotionally valenced vocalizations. Similarly, the negative emoticons and the negative faces were rated as more negative than the vocalizations. As such, the failure to find the predicted modulation of temporal binding by emotion in Experiments 1 and 2 does not seem to be driven by differences in the emotional appraisal of the stimuli.

Experiment 4
Because we did not find an effect of emotion on temporal binding in Experiments 1 and 2, we conducted a direct replication of Yoshie and Haggard (2013) to investigate the replicability of their findings.
[Table note: Means that do not share a common subscript across rows are significantly different (p < 0.05).]

Participants
We recruited 24 participants to achieve 95% power to detect Yoshie and Haggard's reported effect size for their Experiment 1 (dz = 0.77): 12 males and 12 females (aged 18-23: M age = 21.75, SD age = 3.11), three for each of the 8 (2 × 2 × 2) possible orders of conditions (agency/baseline, action/sound and positive/negative vocalizations), counterbalanced between participants. Participants were paid for their time. Following Yoshie and Haggard (2013), we screened for the following exclusion criteria: native language other than English, left handedness, recent use of illicit drugs, uncorrected visual or auditory impairment, and history of psychiatric or neurological illness.

Materials and procedures
Experiment 4 used the exact same auditory stimuli as Yoshie and Haggard (2013). The stimuli were a selection of nonverbal emotional vocalizations, previously validated in the native English population to significantly differ in perceived valence, but not in perceived arousal (Sauter, Eisner, Calder, & Scott, 2010). In the negative condition, each participant's keypress was followed by one of four negative vocalizations (screams expressing fear or retches expressing disgust). In the positive condition, these were replaced by positive vocalizations (cheers expressing achievement or laughs expressing amusement). The auditory stimuli in each condition were carefully matched for pitch (peak frequency) and duration.
This experiment faithfully replicated the procedure used by Yoshie and Haggard (2013). We presented the experiment via Macintosh computers (OS X 10.9.5), and used a customised program running in Inquisit v4.01 (Draine, 1998; Millisecond Software) to present participants with the temporal binding task on a 27-inch flat screen. We used the Libet clock task to measure the perceived timing of actions and sounds. During the experiment, participants viewed a Libet clock. In agency conditions, the participant was instructed to press a key on a computer keyboard with the right index finger at a time of his/her choosing, which caused a sound to occur 250 ms later. The participant was then prompted to report where the clock hand was at the onset of their key-press (agency action condition) or, in a separate block, at the onset of the sound (agency sound condition). In the single-event baseline action condition, the participant pressed a key at a time of his/her choosing. This keypress did not cause a sound, and the participant was asked to judge the time of his/her keypress. In the single-event baseline sound condition, the participant heard sounds at random intervals, which mimicked time intervals of participant key-presses, and judged the times of sound onsets. To make sure that participants understood the task, we asked participants to perform 5 practice trials before each condition.
Participants underwent four task blocks of 32 trials each (baseline action, baseline sound, agency action, and agency sound) for both the negative and positive conditions, or 256 (32 trials × 8 blocks) trials in total. In each block four different sounds of an emotional condition were presented in a randomised order (4 sounds × 8 repetitions). Since each block contained only positive or negative sounds, the four different vocalizations consisted of either the disgust and fear sounds, or the achievement and amusement sounds (each in both male and female voices). Each block was further divided into two sub-blocks of 16 trials each, with the stimuli randomised across the two sub-blocks, such that each sub-block could contain an uneven distribution of sounds. To ensure attention to the auditory stimuli, at the end of every sub-block we asked participants which of the four sounds they heard most frequently during that sub-block. Participants gained a reward of 25 pence for each correct answer to this question. The whole experiment was divided into two sessions of four blocks each. Each session was devoted to action judgments (baseline action and agency action) or sound judgments (baseline sound and agency sound) only. Half of the participants (n = 12) judged the times of action in the first session and of sound in the second session, while in the other half (n = 12) the order was reversed. A 10-min break was inserted between the two sessions. To maximize the effects of emotional valence, within each session the baseline and agency blocks of one emotional condition (e.g., negative) were presented successively, followed after a 5-min break by the blocks of the other emotional condition (e.g., positive). Thus, there was an additional 5-min break within each session.
Both the order of emotional conditions (negative first or positive first) and the order of task types (baseline first or agency first) were consistent across the two sessions for each participant, and counterbalanced between participants (see Yoshie & Haggard, 2013).
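The counterbalancing scheme described above crosses three two-level order factors, which gives 8 possible orders. A minimal sketch of one way to enumerate those orders and spread 24 participants evenly across them follows. The factor labels and the round-robin assignment rule are assumptions made for illustration; the text does not say how individual participants were allocated to orders.

```python
from collections import Counter
from itertools import product

# Three two-level order factors (labels are illustrative).
FACTORS = {
    "task_first": ("baseline", "agency"),
    "judgement_first": ("action", "sound"),
    "valence_first": ("negative", "positive"),
}

TRIALS_PER_BLOCK = 32
BLOCKS = 8  # 4 task blocks x 2 emotional conditions
TOTAL_TRIALS = TRIALS_PER_BLOCK * BLOCKS


def assign_orders(n_participants):
    """Assign participants to the 2 x 2 x 2 orders in round-robin fashion
    (a hypothetical allocation rule, used only for this sketch)."""
    orders = list(product(*FACTORS.values()))
    return [orders[i % len(orders)] for i in range(n_participants)]


assignments = assign_orders(24)
order_counts = Counter(assignments)

for order, n in order_counts.items():
    print(order, n)
```

With 24 participants, each of the 8 orders is used three times, and each participant completes 256 trials in total.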

Results
We used Yoshie and Haggard's (2013) protocol for extracting binding scores. Judgement errors were calculated individually for each block by subtracting the actual onset of the event from the perceived onset. Positive values reflect a delayed judgement, and negative values reflect an anticipatory (early) judgement. Action binding (shift) was calculated by subtracting the mean judgement error of the action in the baseline condition from the mean judgement error in the agency condition. Similarly, sound binding (shift) was calculated by subtracting the mean judgement error of the sound in the baseline condition from the mean judgement error of the sound in the agency condition. Composite binding was calculated by subtracting the mean shift in sound judgements from the mean shift in action judgements. Per Yoshie and Haggard (2013), paired t-tests (negative vs. positive) were used to assess the effects of emotional valence on temporal binding. We performed a Grubbs test for outliers (Grubbs, 1950), and no participant met the criteria for exclusion (all ps > 0.05). Additionally, we compared scores between positive and negative vocalizations on an attention task asking participants to state the most frequent sound within the preceding sub-block. A paired-samples t-test revealed no difference in participants' attention to sounds between negative (M = 3.83, SD = 1.34) and positive (M = 3.63, SD = 1.21) vocalizations, t(23) = 0.96, p = 0.35, dz = 0.20. Table 2 shows the mean judgment errors and shifts relative to baseline conditions for different emotional conditions. The presence of action binding was confirmed by a shift in judgement errors that was significantly different from zero for action judgements in both the negative, t(23) = 2.94, p = 0.007, dz = 0.60, and positive conditions, t(23) = 3.78, p = 0.001, dz = 0.77.
Similarly, sound binding was also significant for both negative, t(23) = 5.47, p < 0.001, dz = 1.12, and positive vocalizations, t(23) = 5.27, p < 0.001, dz = 1.08. Composite binding did not differ significantly between the negative (M = −234.68, SD = 174.71) and positive conditions (M = −280.38, SD = 134.20), t(23) = 1.20, p = 0.24, dz = 0.24. Similarly, paired t-tests revealed no significant difference in sound binding, t(23) = 0.64, p = 0.53, dz = 0.13, or action binding, t(23) = 1.16, p = 0.26, dz = 0.24, between the positive and negative conditions.
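The scoring steps and effect sizes in this section can be made concrete with a small sketch. The helper names are hypothetical and the millisecond values in the usage example are invented purely for illustration; they are not data from the experiment. One assumption needs flagging: composite binding is computed here as sound shift minus action shift, so that stronger binding gives a more negative score, in line with the negative composite means reported above. The wording of the scoring description could be read with the opposite sign. The effect size dz for a paired or one-sample t-test is obtained from the standard relation dz = t / √n.

```python
from math import sqrt
from statistics import mean


def mean_judgement_error(perceived_ms, actual_ms):
    """Mean of (perceived onset - actual onset); positive means judged late."""
    return mean(p - a for p, a in zip(perceived_ms, actual_ms))


def shift(agency_error, baseline_error):
    """Binding shift: agency-condition error minus baseline error."""
    return agency_error - baseline_error


def composite_binding(action_shift, sound_shift):
    """ASSUMED sign convention: sound shift minus action shift,
    so stronger binding yields a more negative value."""
    return sound_shift - action_shift


def dz_from_t(t_value, n_pairs):
    """Cohen's dz recovered from a paired/one-sample t statistic."""
    return t_value / sqrt(n_pairs)


# Invented example values (ms), not experimental data.
action_shift = shift(agency_error=40.0, baseline_error=10.0)    # +30: judged later
sound_shift = shift(agency_error=-180.0, baseline_error=20.0)   # -200: judged earlier
composite = composite_binding(action_shift, sound_shift)        # -230

# Consistency check against the t values reported in the text (n = 24).
dz_action_negative = round(dz_from_t(2.94, 24), 2)
dz_action_positive = round(dz_from_t(3.78, 24), 2)
dz_composite_diff = round(dz_from_t(1.20, 24), 2)
```

Applied to the reported t statistics, the relation reproduces the published dz values (0.60, 0.77, and 0.24 for the three cases shown).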

Discussion
The findings of Experiment 4 suggest that temporal binding, as measured using the Libet clock method, was not significantly modulated by positive versus negative sound vocalizations as action outcomes. It is worth noting that although we did not find significant modulation of temporal binding by emotional valence, the effect we observed was nonetheless in the same direction as Yoshie and Haggard's (2013) effect. Thus, if there is an effect of emotional valence on temporal binding using the Libet task and sound vocalizations, it is smaller than previously thought. Moreover, given the results of our Experiments 1 and 2, the effect of emotional valence of action outcomes on temporal binding does not seem to generalize using emotionally valenced visual stimuli and time interval estimation tasks.

General discussion
The objective of this series of experiments was to investigate the degree to which temporal binding is modulated by emotional valence. Studies 1 and 2 found no significant difference in temporal binding between positive and negative emoticons (Study 1) or positive and negative real facial expressions (Study 2). Study 3 revealed that the stimuli used in Studies 1 and 2 were equivalent in valence and arousal to stimuli that have previously been observed to modulate temporal binding (Yoshie & Haggard, 2013). Furthermore, in a highly powered replication study (Study 4), we observed no significant modulation of temporal binding by emotionally valenced vocalizations (Yoshie & Haggard, 2013). Taken together, these findings cast doubt on whether temporal binding is influenced by outcome valence.
Despite showing no significant modulation by valence, temporal binding itself was clearly present in Study 4. Indeed, the binding scores were overall somewhat larger than Yoshie and Haggard's (2013). This suggests that the absence of a valence effect in our study was not due to reduced sensitivity to detect emotional modulation. Although not significant, the effect of valence on binding was in the predicted direction in the current study. However, it is worth noting that this was largely driven by greater action binding to positive tones, whereas Yoshie and Haggard's (2013) effect was more strongly localised on outcome binding. More recently, Christensen et al. (2016) investigated the effect of outcome valence on prospective and retrospective components of action binding (see Moore & Obhi, 2012) with the same vocalizations used here and in Yoshie and Haggard (2013). They observed significantly increased retrospective action binding only when the valence of the outcome was unpredictable. However, for predictable outcomes (as used in the current study) there was reduced action binding for both positive and negative outcomes compared to neutral outcomes. Taken together with the current findings, a complex picture emerges whereby the precise effect of emotion on temporal binding cannot be clearly attributed to a simple self-serving bias such that positive outcomes increase binding. This may reflect a genuine complexity in the precise mechanisms driving the emotional modulation of binding, or it might reflect the fact that the underlying effect is small or unreliable. The absence of an effect of valence in Experiments 1 and 2 suggests that any effect, if present in the population, does not generalize to other measures of binding.
Future work should attempt to replicate and extend other examples of self-serving bias in temporal binding (Aarts et al., 2012; Takahata et al., 2012) and sensory attenuation (Gentsch, Weiss, Spengler, Synofzik, & Schütz-Bosbach, 2015; Hughes, 2015) to further advance our understanding of how (or if) outcome valence influences implicit agency.
Assessing the degree to which binding is modulated by factors that also modulate explicit agency reports is important to determine the relationship between implicit and explicit agency. Recent evidence suggests that neither sensory attenuation (Dewey & Knoblich, 2014) nor temporal binding (Dewey & Knoblich, 2014; Saito, Takahata, Murai, & Takahashi, 2015) correlates with explicit reports of agency. While explicit and implicit measures will never show total convergence, positive evidence of covariation is important to argue that conscious reports and unconscious biases are indeed measuring the same underlying process. The current studies provide new evidence that questions the degree to which temporal binding is modulated by self-serving biases.