Me or we? Action-outcome learning in synchronous joint action

Goal-directed behaviour requires mental representations that encode instrumental relationships between actions and their outcomes. The present study investigated how people acquire representations of joint actions where co-actors perform synchronized action contributions to produce joint outcomes in the environment. Adapting an experimental procedure to assess individual action-outcome learning, we tested whether co-acting individuals link jointly produced action outcomes to individual-level features of their own action contributions or to group-level features of their joint action instead. In a learning phase, pairs of participants produced musical chords by synchronizing individual key press responses. In a subsequent test phase, the previously produced chords were presented as imperative stimuli requiring forced-choice responses by both pair members. Stimulus-response mappings were systematically manipulated to be either compatible or incompatible with the individual and joint action-outcome mappings of the preceding learning phase. Only joint but not individual compatibility was found to modulate participants' performance in the test phase. Yet, opposite to predictions of associative accounts of action-outcome learning, jointly incompatible mappings between learning and test phase resulted in better performance. We discuss a possible explanation of this finding, proposing that pairs' group-level learning experience modulated how participants encoded ambiguous task instructions in the test phase. Our findings inform current debates about mechanistic explanations of action-outcome learning effects and provide novel evidence that joint action is supported by dedicated mental representations encoding own and others' actions on a group level.


Introduction
Joint actions, such as playing a piano duet, moving furniture together or greeting a friend with a fist bump, require people to purposefully coordinate their actions in the service of shared goals (Butterfill, 2018; Pacherie, 2013; Sebanz, Bekkering, & Knoblich, 2006). Previous research indicates that successful performance of joint actions directed at shared goals hinges on dedicated mental representations specifying the desired outcomes of the joint action and the joint contributions of self and others that are needed to achieve them (Knoblich, Butterfill, & Sebanz, 2011; Vesper, Butterfill, Knoblich, & Sebanz, 2010; see Sebanz & Knoblich, 2021 for a recent review). Yet, little is known about how people acquire these representations and what learning mechanisms underlie their formation. To fill this gap, the current study investigated how co-actors learn and represent novel instrumental relationships between their coordinated actions and the joint outcomes they produce together.

Action-outcome learning
Producing goal-directed actions, either alone or together with others, presupposes knowledge about instrumental relationships between actions and their resulting outcomes on the body and the environment (Dickinson & Balleine, 1994; Hommel, 2017; Wolpert & Flanagan, 2001). A parsimonious way to explain how people acquire such knowledge implicitly through sensorimotor experience is provided by associative accounts of goal-directed behaviour (de Wit & Dickinson, 2009), most prominently by ideomotor theories (Greenwald, 1970; Hommel, Müsseler, Aschersleben, & Prinz, 2001; Prinz, 1997; see Pfister, 2019 and Shin, Proctor, & Capaldi, 2010 for reviews). Ideomotor theories propose that the capacity to produce goal-directed behaviour relies on bidirectional action-outcome associations that are acquired through the recurrent experience of contingent relationships between actions and their outcomes. Once established, bidirectional action-outcome associations provide a simple mechanistic explanation of how actions become causally initiated and controlled: By activating perceptual representations of action outcomes, either through internal mental anticipation or through external perception in the environment, their associated motor programs become directly activated and prepared for execution. Thus, ideomotor theories propose that actions are initiated and controlled through representations of their associated outcomes.
Evidence that bidirectional action-outcome associations underlie learning and control of individual motor behaviour comes from studies employing a well-established two-stage action-outcome learning task (cf. Elsner & Hommel, 2001). The task involves an initial learning phase, in which people experience novel action-outcome relationships by performing simple actions (e.g., left- and right-handed button presses) that contingently produce arbitrary outcomes in the environment (e.g., low and high tones). Whether people acquired bidirectional action-outcome associations during the learning phase is then probed in a subsequent test phase in which the former action outcomes are presented as imperative stimuli, requiring participants to perform either free- or forced-choice responses. Numerous studies have demonstrated that the presentation of former action outcomes in the test phase primes execution of those responses that produced these outcomes in the preceding learning phase (Elsner & Hommel, 2001, 2004; Herwig, Prinz, & Waszak, 2007; Hoffmann, Lenhard, Sebald, & Pfister, 2009; Janczyk, Giesen, Moeller, Dignath, & Pfister, 2022; Pfister, Kiesel, & Hoffmann, 2011; Sun, Custers, Marien, & Aarts, 2020; Watson, Van Steenbergen, De Wit, Wiers, & Hommel, 2015). These findings support the idea that the recurrent experience of contingent relationships between actions and their outcomes leads to the acquisition of bidirectional action-outcome associations.
Bidirectional action-outcome associations thus provide a mechanistic explanation of how people acquire mental representations that enable the production of individual goal-directed actions. In the current study, we investigated how action-outcome learning takes place during joint actions and how it could lead to the acquisition of dedicated joint action representations guiding interpersonal action coordination in the service of shared goals.

Representing joint actions
Many joint actions involve deliberate coordination between actions of multiple individuals that are collectively directed at the production of joint outcomes in the environment (Butterfill, 2012; Pacherie, 2013; Sebanz et al., 2006). Thus, in contrast to individual actions, joint actions seem to require that individuals learn and represent how their own actions together with those of their co-actors produce outcomes in the environment that reflect the joint effects of their coordinated action contributions.
These representational peculiarities of joint action are underlined by a range of empirical findings. By investigating how pairs of pianists monitor individual and joint performance during musical duets, Loehr, Kourtis, Vesper, Sebanz, and Knoblich (2013) showed that co-actors are especially sensitive to performance errors altering outcomes that depend on their common contributions to the joint action (e.g., the harmony of jointly produced chords), while individual performance errors that leave these joint action outcomes unaffected are processed with lower priority. Furthermore, Loehr and Vesper (2016) showed that when people learn to play simple piano melodies as part of a musical duet performed with an accompanist, it is more difficult for them to produce the learned melody alone without the accompanying auditory feedback of their counterpart. Together, these studies indicate that actions performed in joint action contexts are predominantly guided by representations of joint rather than individual action outcomes.
This raises the question of how co-actors represent instrumental relationships between their actions and the joint outcomes they produce together. One theoretical possibility proposed by Vesper et al. (2010) is that people engaging in joint actions rely on minimal joint action representations that merely specify the outcome of the joint action and an individual's own action contributions that are needed to achieve it, while only being aware that the outcome cannot be achieved by acting alone (captured by the formula "ME + X"). Thus, according to this proposal, individuals engaged in joint actions may link the production of joint action outcomes only to their own contributions to the joint action while their co-actors' contributions would not need to be specified in any detail.
While these minimal representational requirements capture a range of scenarios that qualify as genuine cases of goal-directed joint action (cf. Knoblich et al., 2011; Vesper et al., 2010), many empirical findings indicate that people often form more elaborate representations of their joint actions that also specify their co-actors' contributions to the joint action in more detail. Studies on shared task representations show that people tend to co-represent specific aspects of others' tasks and actions when acting alongside each other, which is indexed by modulations of people's individual task performance by specifics of their co-actors' tasks and actions. For example, people's individual task performance has been shown to be modulated by simple stimulus-response rules (Atmaca, Sebanz, Prinz, & Knoblich, 2008; Sebanz, Knoblich, & Prinz, 2003, 2005) and action-outcome mappings (Pfister, Dolk, Prinz, & Kunde, 2014; Sacheli, Arcangeli, & Paulesu, 2018; Sacheli, Musco, Zazzera, & Paulesu, 2021) of their co-actors as well as by more elaborate aspects of others' tasks, such as their physical task constraints (Schmitz, Vesper, Sebanz, & Knoblich, 2017) or the order of their actions (Schmitz, Vesper, Sebanz, & Knoblich, 2018).
These findings imply that joint action representations can encode not only an individual's own contributions to the joint action but also those of their co-actors. Yet, the assumption of shared task representations leaves open the question of how co-actors integrate information about their own and their partners' action contributions into unified representations of their joint action performance (Butterfill, 2015; Keller, Novembre, & Loehr, 2016; Knoblich & Jordan, 2003; Pesquita, Whitwell, & Enns, 2018; Sebanz & Knoblich, 2009; Sinigaglia & Butterfill, 2022).
An interesting possibility is that co-actors form joint action representations that specify foremost what they are pursuing together as a group (i.e., as a "WE") rather than as separate interacting individuals (i.e., as "ME + YOU") (Butterfill, 2015; Della Gatta et al., 2017; Gallotti & Frith, 2013; Kourtis, Woźniak, Sebanz, & Knoblich, 2019; Pacherie, 2013; Tsai, Sebanz, & Knoblich, 2011). Thus, instead of specifying the individual action contributions of self and others separately and in parallel, joint action representations may primarily encode how actions are to be performed by the group as a whole.
Ample empirical support for the idea that co-actors form group-level representations of their joint actions is now provided by several lines of research. Studies investigating rationality principles of higher-level action planning in joint action contexts have shown that co-actors raise individual effort to maximize action efficiency at the level of the group (Török, Pomiechowska, Csibra, & Sebanz, 2019; Török, Stanciu, Sebanz, & Csibra, 2020). Studies investigating how co-actors experience agency in joint action contexts have demonstrated that people's judgments of control over joint actions are strongly affected by group-level task performance (Dewey, Pacherie, & Knoblich, 2014; Loehr, 2018) and seem to reflect a sense of joint rather than self-agency (Bolt & Loehr, 2017; Bolt, Poncelet, Schultz, & Loehr, 2016).
Further evidence for group-level action representations has been provided by studies on action mimicry. These studies showed that actions performed in synchrony or in turns with a partner are facilitated when co-actors observe the same actions performed by another dyad compared to when observing only individual parts of the joint action performed by a single actor (Ramenzoni, Sebanz, & Knoblich, 2014; Tsai et al., 2011). Thus, performing actions as part of a joint action benefits more from observing another group modeling the joint action than from observing a single actor modeling only individual parts of it.
Lastly, studies on interpersonal coordination indicate that joint action representations can specify not only each co-actor's separate contributions to a joint action but also how the individual contributions of self and others relate to each other at the level of the group. This is supported by findings showing that interference effects between observed and executed actions of two co-actors become modulated when both actions are performed as interrelated contributions towards a shared goal (Clarke et al., 2019; Della Gatta et al., 2017; Sacheli et al., 2018). Furthermore, a recent study by Kourtis et al. (2019) showed that co-actors' action initiation and coordination performance benefits from prior information about pending joint actions that merely specifies relations between co-actors' upcoming individual action contributions (e.g., whether co-actors will perform similar or different actions).
Taken together, the reviewed evidence for group-level action representations opens up the possibility that co-actors may link the production of joint action outcomes directly to group-level relations between their individual contributions to the joint action.

The present study
The purpose of the present study was to examine how people acquire joint action representations by investigating how co-actors come to represent novel instrumental relationships between their coordinated actions and the joint outcomes they produce together. To that end, we assessed how the recurrent experience of contingent relationships between joint actions and their resulting outcomes affects action-outcome learning in jointly acting individuals.
Based on the previous literature reviewed above, we contrasted two theoretical alternatives of how co-actors might link their coordinated actions to the joint outcomes they produce together. A first possibility is that action-outcome learning during joint action is merely sensitive to an individual's own contributions to the joint action. Thus, individuals engaged in a joint action may link the production of joint action outcomes merely to their own action contributions but not to the contributions of their co-actors. This would indicate that action-outcome learning during joint action leads to the implicit formation of minimal joint action representations, merely specifying the joint action outcome and an individual's own contribution to it.
Alternatively, action-outcome learning during joint action may instead be sensitive to group-level relations between co-actors' individual contributions to the joint action. According to this possibility, individuals engaged in a joint action may link the production of joint action outcomes directly to spatial and/or temporal relations between their own and their co-actors' contributions to the joint action that emerge at the level of their group-level performance. This would indicate that action-outcome learning during joint action leads to the implicit formation of group-level representations of the joint action. See Fig. 1 for a visual illustration of the two theoretical alternatives.
To contrast these alternatives, we set out to investigate action-outcome learning in synchronous joint actions that require multiple co-actors to synchronize their individual action contributions with one another to achieve a desired outcome in the environment. Synchronous joint actions provide a specifically interesting test case for action-outcome learning as they introduce contrasting predictions of how jointly acting individuals could represent instrumental relationships between their actions and perceived outcomes: When co-actors perform synchronized actions to produce a joint outcome (e.g., performing synchronized key strokes on a piano to produce a harmonic chord), each co-actor could attribute the perceived outcome to individual-level features of their own action contribution alone (e.g., I pressed a certain key with my left/right index finger), but also to relational group-level features of their own and their partners' action contributions taken together (e.g., WE pressed a certain configuration of keys with similar/different fingers). As such, synchronous joint actions should allow us to test whether action-outcome learning in joint action contexts leads to the formation of minimal or group-level representations of the joint action.
Therefore, we adapted the individual action-outcome learning task by Elsner and Hommel (2001) to a synchronous joint action setting. In an initial learning phase, two co-actors produced a series of low and high two-tone chords by means of synchronized key presses on a joint response key layout with four horizontally aligned keys. In a subsequent test phase, both co-actors were instructed to respond to the former chord outcomes according to a stimulus-response mapping that was manipulated to either preserve or reverse the action-outcome relationships of the previous learning phase, with respect to both individual-level features of each co-actor's isolated responses and relational group-level features of co-actors' joint response configurations (see Fig. 2). According to the minimal account, action-outcome learning should be sensitive only to individual-level features of co-actors' isolated response contributions. Following associative accounts of action-outcome learning, this should be reflected in a performance advantage in the test phase when the instructed stimulus-response mapping preserves the action-outcome relationship of the preceding learning phase on the level of each individual co-actor. In contrast, according to the group-level account, action-outcome learning should be sensitive to relational features of co-actors' joint response configurations. Following associative accounts of action-outcome learning, this should be reflected in a performance advantage in the test phase when the instructed stimulus-response mapping preserves the action-outcome relationship of the preceding learning phase on the level of the group.

Participants
In total, eighty adult participants took part in the experiment, grouped into pairs that were randomly composed upon study sign-up. Four pairs were excluded from analysis because they met preregistered exclusion criteria, so that the final sample included seventy-two participants (25 male, 47 female, M_Age = 25.8, SD_Age = 4.5) grouped into thirty-six pairs (17 same-gender pairs, 19 mixed-gender pairs).¹ Recruitment took place through an online research participation system of Central European University (CEU) in Vienna, Austria. All participants gave written informed consent and were compensated with 10 Euro for participation. Ethical approval for the study was granted by the Psychological Research Ethics Board of CEU.
1 Previous studies deploying variants of our task in the domain of individual action observed medium to large effect sizes for individual compatibility manipulations in the test phase (Eder & Dignath, 2017; Elsner & Hommel, 2001, Exp. 1; Hoffmann et al., 2009, Exp. 1; Hommel, Alonso, & Fuentes, 2003; Wolfensteller & Ruge, 2011). A sensitivity analysis in G*Power (Faul, Erdfelder, Lang, & Buchner, 2007) showed that the final sample size of N = 72 would have been sufficient to detect main effects of size d = 0.67 with 80% power at an alpha level of 0.05 (two-sided) in a 2 × 2 between-subjects factorial design. Thus, our final design and sample size should have been sufficiently powered to detect and replicate an individual compatibility effect, if the effect persists in joint action contexts. We take this as a reasonable starting point for justifying our sample size decision, as our preregistered predictions were targeted at main effects in our final 2 × 2 between-subjects factorial design.
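The order of magnitude of the sensitivity analysis in footnote 1 can be cross-checked with a short calculation. The sketch below is not the G*Power procedure itself: it assumes that a main effect in a 2 × 2 between-subjects design with N = 72 can be treated as a two-sided comparison of two marginal groups of 36 participants, and it uses a normal approximation rather than the exact noncentral distribution. The function name `min_detectable_d` is introduced here purely for illustration.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_d(n_per_group: int, alpha: float = 0.05, power: float = 0.80) -> float:
    """Approximate smallest Cohen's d detectable in a two-sided, two-group comparison.

    Normal approximation: d = (z_{1 - alpha/2} + z_{power}) * sqrt(2 / n).
    This slightly underestimates the exact (noncentral-distribution) value.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile corresponding to the target power
    return (z_alpha + z_power) * sqrt(2 / n_per_group)


# Assumption: a main effect contrasts two marginal groups of N / 2 participants.
d = min_detectable_d(n_per_group=72 // 2)
print(f"approximate minimum detectable d = {d:.2f}")
```

Under these assumptions the approximation yields a value of about 0.66, close to the d = 0.67 reported from G*Power; the small difference is expected because the approximation ignores the finite degrees of freedom.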

Apparatus and stimuli
The experiment was run in a quiet, well-lit room. Stimulus presentation and response recording were controlled by a custom-made script written in PsychoPy (Peirce et al., 2019), running on a Dell computer attached to a 24-in. LCD monitor with a refresh rate of 60 Hz. Each participant responded using an individual response box (The Black Box ToolKit; dimensions: 202 mm × 137 mm × 35 mm LWH) with four horizontally aligned response buttons. Only the two outer buttons of the response boxes were used for the experiment. Unless specified otherwise, written instructions and text stimuli were presented in white letters against a black background on the computer monitor at a viewing distance of approximately 60 cm. Auditory stimuli comprised four synthesized organ notes differing in pitch (C4, G4, C5, G5)² and were presented via stereo speakers (Genius SP-HF 180) placed to the left and right of the computer monitor at a volume of approximately 60 dB.

Procedure
The experiment was divided into a learning and a test phase and lasted about 60 min. The experimenter was present throughout the whole session and monitored the procedure from outside of the participants' view.

Learning phase
The learning phase comprised two parts. The first part was performed by each pair member alone while the other waited outside the laboratory room (solo part). The second part was performed by both pair members together (joint part).
Solo part. For the solo part of the learning phase, participants were seated centrally at the long side of a table facing the computer monitor with a single response box (labelled with "A" or "B" respectively) placed in front of them. One pair member received the "A"-labelled box (referred to as Participant A), the other the "B"-labelled box (referred to as Participant B). Participants were instructed to produce a series of high and low tones by pressing the left and the right button on their response box with the index finger of their left and right hand respectively. The mapping between responses and tones was instructed by an illustration of the response box highlighting the respective buttons and labelling them with "high" and "low" respectively (see Fig. 3A). Within pairs, participants always received the reversed mapping compared to their partner (e.g., Participant A: left ➔ high tone, right ➔ low tone; Participant B: left ➔ low tone, right ➔ high tone; cf. Fig. 2). The mappings were counterbalanced across pairs. One pair member produced C-notes (low tone: C4, high tone: C5) while the other pair member produced G-notes (low tone: G4, high tone: G5). Thus, for both pair members the pitch difference between low and high tones was one octave.
Fig. 2. Experimental set-up and design. Panel A shows the learning phase. In the solo part of the learning phase (left), individual participants produced low and high tones by pressing the left (L) or right (R) button on their individual response box. In the joint part of the learning phase (right), pairs of participants performed synchronous responses to produce low- and high-pitched two-tone chords together. At the group level, co-actors produced the two chords by performing synchronized responses to the inner (marked in blue) or outer buttons (marked in orange) of their joint response button layout. At the individual level, each co-actor contributed to the chord outcomes by performing a response to the left or right button on their individual response box. Thus, the jointly produced chords could be represented as being contingent on the relational response configurations co-actors performed together (i.e., WE press the inner/outer buttons) or as being related to an individual participant's own response contribution (I press my left/right button) or both. Panel B shows the test phase. Pairs of participants responded in parallel to the previously produced chords according to a prescribed stimulus-response mapping. On the group level, the stimulus-response mapping was manipulated to either preserve (Jointly Compatible mapping) or reverse (Jointly Incompatible mapping) the action-outcome mapping of the previous learning phase with respect to co-actors' joint response configurations (inner/outer). The stimulus-response mapping was also manipulated at the individual level, so that it either preserved (Individually Compatible mapping) or reversed (Individually Incompatible mapping) the action-outcome mapping of the previous learning phase regarding each co-actor's individual response contributions (left/right). To manipulate joint and individual compatibility of the stimulus-response mappings orthogonally to each other, co-actors were instructed to switch their seating positions from learning to test phase in two of the four test phase conditions (indicated by the red arrows). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
2 Organ tones were chosen due to their sharp attack and decay. All tones were synthesized with the digital audio workstation software LMMS (https://lmms.io/) using open-source sound fonts.
At the beginning of each trial, participants were instructed to decide which tone to produce next by pressing the assigned button on their response box. Participants could choose freely but were instructed to produce a balanced amount of high and low tones throughout the experimental phase. After indicating their decision, participants' choice was centrally displayed as a text prompt (e.g., "high tone") for 1000 ms. Then, a counter appeared on screen, counting in an interval of 500 ms from three down to a "GO!" prompt that remained on screen for 500 ms. Participants were instructed to issue their respective response in synchrony with the onset of the "GO!" prompt. If participants responded in line with their indicated decision and if their response fell within a response window of 500 ms around the onset of the "GO!" prompt, the respective tone was played for a duration of 600 ms. If participants' responses fell outside an additional asynchrony window of 250 ms around the onset of the "GO!" prompt, a feedback message reminding them to respond in synchrony with the onset of the "GO!" prompt was presented on screen after the tone had played. If participants responded not in line with their indicated decision or gave no response within the response window, no tone was played, and an error message was displayed in the centre of the screen for 1500 ms. Throughout each trial, a visual illustration of participants' response box was displayed at the bottom of the computer screen, mirroring participants' responses. As soon as participants produced a response within the response window, the respective response button in the display was highlighted until the end of the trial. The next trial started after a blank screen, displayed for 500 ms. Break messages appeared after every ten successful trials (trials in which participants produced a tone) and informed participants about the ratio of high and low tones produced so far. The solo part of the learning phase ended after each pair member had performed forty successful trials.
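The feedback rules of the solo part can be summarised as a small decision function. This is a minimal sketch and not the authors' PsychoPy script. It assumes that the "window of 500 ms around the onset" and the "asynchrony window of 250 ms around the onset" extend that far on either side of the "GO!" onset (by analogy with the joint part); the function name, argument names and outcome labels are hypothetical.

```python
def classify_solo_response(
    rt_ms,
    matches_choice,
    response_half_window_ms=500.0,  # assumption: window spans +/- 500 ms around "GO!"
    sync_half_window_ms=250.0,      # assumption: asynchrony window spans +/- 250 ms
):
    """Classify one solo-part response.

    rt_ms is the response time relative to "GO!" onset (negative = early);
    None means that no response was given.
    """
    # No response, a response outside the window, or a response that does not
    # match the announced choice: no tone is played and an error is shown.
    if rt_ms is None or not matches_choice or abs(rt_ms) > response_half_window_ms:
        return "error"
    # Valid but poorly timed: the tone is played, followed by a synchrony reminder.
    if abs(rt_ms) > sync_half_window_ms:
        return "tone_then_reminder"
    # Valid and well timed: the tone is played without further feedback.
    return "tone"


print(classify_solo_response(rt_ms=-120.0, matches_choice=True))
print(classify_solo_response(rt_ms=320.0, matches_choice=True))
```

Only trials classified as "tone" or "tone_then_reminder" would count towards the forty successful trials, since a tone is produced in both cases.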
Joint part. For the joint part of the learning phase, the two pair members were seated side by side at the long side of the table, both facing the centrally placed computer monitor. Participant A sat on the left side, Participant B sat on the right. Participants' individual response boxes were placed between markers on the table in front of them, 20 cm apart from each other. They were instructed to produce high and low two-tone chords together by each pressing one of their response buttons in synchrony with their partner. The target chords were two organ chords made up of one C- and one G-note and differed in their relative pitch by one octave (lower chord: C4 G4, higher chord: C5 G5). The mappings between participants' individual responses and tone outcomes remained the same as in the solo part of the learning phase, so that one pair member was responsible for playing the C-note and the other for playing the G-note of a chord. Consequently, producing the target chords required pair members to produce different spatial response configurations together: Pairs jointly responded to the inner two buttons (requiring Participant A to press their right and Participant B to press their left button) or the outer buttons (requiring Participant A to press their left and Participant B to press their right button) of their joint response button layout (see Fig. 2). The mappings between participants' responses and chord outcomes were instructed by an illustration, showing the response boxes of both pair members side by side, highlighting the relevant response buttons, and labelling them with "low" and "high" respectively (see Fig. 3B). The rationale behind using this instruction was to leave it to the participants whether to code their responses at the group level (inner/outer) or at an individual level (left/right).
The trial timeline for the joint part of the learning phase is illustrated in Fig. 4A. At the beginning of each trial, pairs were instructed to jointly decide which chord to produce next. To this end, one pair member was randomly assigned to verbally propose the next chord to their partner, who could either approve the proposal or propose the other alternative instead. The assignment was instructed by means of the prompts "Propose!" and "Approve!" printed on the side of the screen corresponding to participants' seating positions and was balanced across trials. Pairs were instructed to produce a roughly balanced amount of low and high chords throughout the experimental phase. As soon as participants reached an agreement, the experimenter registered their choice and advanced the trial manually.³ First, pairs' choice was centrally displayed as a text prompt on screen (e.g., "high chord") for 1000 ms. Then, a counter appeared on screen, counting in an interval of 500 ms from three down to a "GO!" prompt that remained on screen for 500 ms. To produce the chords, participants had to respond as synchronously as possible at the onset of the "GO!" prompt.⁴ The response window opened 500 ms before the onset of the "GO!" prompt and closed 500 ms after. The first registered response by one of the pair members that fell within this response window, and that matched the required response configuration, triggered the tone assigned to that response. If the second pair member's response also fell within the response window and within an asynchrony window of ms with respect to the first pair member's response, the tone mapped to the second response was triggered too. The respective two-tone chord resulting from both tones played simultaneously was then played for a duration of 1000 ms.
Fig. 3. Illustrations used to instruct participants about their action-outcome mappings in the solo part (A) and in the joint part (B) of the learning phase. In the test phase, stimulus-response mappings were instructed with a picture corresponding to panel B.
3 Initial proposals were approved by the second pair member in over 99% of all trials.
4 Using a countdown in the learning phase had the aim to facilitate synchronized responding of both partners by making the onset of the GO! prompt predictable. This procedural aspect of the learning phase deviates from common procedures of the two-stage action-outcome learning task deployed in previous studies. An anonymous reviewer suggested that the synchrony instruction in the learning phase could have led participants to respond to a ms delay after stimulus onset in the subsequent test phase because they were trained to do so in the learning phase. We deem this possibility rather unlikely because, in contrast to the GO! prompt in the learning phase, both timing and type of the test phase stimuli were randomized (see details below). Furthermore, as the timing demands in the learning and the test phase were identical across experimental conditions, they should not be able to serve as an alternative explanation of predicted performance differences between experimental conditions.
If one pair member produced a response that did not match the required response configuration, the trial was ended, either stopping a previously triggered tone of their partner or preventing registration of any further response. In this case, an error message ("Wrong response!") was displayed after the offset of the "GO!" prompt at the side of the participant who had performed the error. If one pair member did not produce a response within the response window, previously triggered tones of their partner were stopped, and an error message ("Missing response!") was displayed at the side of the participant who had failed to respond. If no response was registered within the response window, the error message was displayed on both sides of the screen. If both pair members produced correct responses within the response window but failed to meet the asynchrony demand between their responses, the tone triggered first was stopped immediately and the feedback message "Be more synchronous!" appeared centrally on screen. Feedback and error messages were displayed for 1500 ms.
Throughout each trial, an illustration of the two response boxes displayed side by side was shown at the bottom of the computer screen, mirroring participants' responses. As soon as a pair member produced a response within the response window, the respective response button in the display was highlighted until the end of the trial. The next trial started after a blank screen, displayed for 500 ms. There was a short break after every twenty successful trials (trials in which pairs produced a chord together), in which participants were informed about the ratio of high and low chords produced so far. The joint part of the learning phase ended once pairs had performed eighty successful trials.
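The trial logic described above for the joint part of the learning phase can be sketched roughly as follows. This is a minimal illustration, not the experiment software used in the study: the names `Response` and `classify_joint_trial` are hypothetical, the asynchrony limit is passed in as a parameter because its value is not given in the text above, and the precedence of a wrong response over a missing one is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Response window: opens 500 ms before and closes 500 ms after "GO!" onset.
WINDOW_HALF_WIDTH_MS = 500


@dataclass
class Response:
    time_ms: float  # time relative to "GO!" onset (negative = before onset)
    correct: bool   # True if it matches the required response configuration


def in_window(response: Optional[Response]) -> bool:
    """A response counts only if it exists and falls inside the response window."""
    return response is not None and abs(response.time_ms) <= WINDOW_HALF_WIDTH_MS


def classify_joint_trial(left: Optional[Response],
                         right: Optional[Response],
                         max_asynchrony_ms: float) -> str:
    """Return the outcome of one joint learning trial.

    max_asynchrony_ms is a hypothetical parameter standing in for the
    asynchrony window of the original procedure.
    """
    registered = [r for r in (left, right) if in_window(r)]
    # Assumption: a wrong response ends the trial regardless of the partner.
    if any(not r.correct for r in registered):
        return "wrong_response"
    if len(registered) < 2:
        return "missing_response"
    if abs(left.time_ms - right.time_ms) > max_asynchrony_ms:
        return "not_synchronous"
    return "chord"
```

Only the outcome `"chord"` corresponds to a successful trial, that is, one that counts towards the eighty successful trials required to complete the joint part of the learning phase.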

Test phase
The test phase started directly after the joint part of the learning phase with no break in between. Depending on the experimental condition, pairs either remained in their current seating position or were instructed to swap seats, each of them taking their individual response box with them to the new seating position (see Fig. 2). Pairs were then instructed to respond to the auditory presentation of the same high and low two-tone chords they had produced in the preceding learning phase as quickly and as accurately as possible in accordance with a fixed stimulus-response mapping displayed on screen.⁵ The stimulus-response mapping was instructed through an illustration similar to the one used in the joint part of the learning phase. It depicted the two response boxes of both pair members side by side, highlighting the respective response buttons and labelling them with "high" and "low", respectively (see Fig. 3B). Given this instruction, participants could construe their responses at the group level (inner/outer), at the individual level (left/right), or both.
Furthermore, pairs were instructed to withhold any response when chords were presented together with the display of a red "X" on the computer screen (no-go trials).⁶ Depending on the experimental condition, the stimulus-response mappings required pairs to respond either with the same inner/outer response configuration they had performed together to produce the two chords in the preceding learning phase (Jointly Compatible mapping) or not (Jointly Incompatible mapping). At the same time, the stimulus-response mapping required each individual participant to respond either with the same left/right response they had performed individually to produce the two chords in the preceding learning phase (Individually Compatible mapping) or not (Individually Incompatible mapping) (see Fig. 2).
The trial timeline for the test phase is illustrated in Fig. 4B. Each trial started after a variable inter-trial interval that ranged between 500 and 1500 ms with the presentation of the higher or the lower chord, played for a duration of 1000 ms. The chord was presented together with a white exclamation mark (go trials) or a red "X" (no-go trials) displayed centrally on screen. The response window was open for the duration of the chord. Participants received error feedback for incorrect or missing responses, displayed for 1500 ms on the side of the pair member who had produced the error. Throughout each trial, an illustration of the two response boxes displayed side by side was shown at the bottom of the computer screen, mirroring participants' responses. As soon as a pair member produced a response within the response window, the respective response button in the display was highlighted until the end of the trial.
Pairs performed 50 trials that were divided into 5 sub-blocks of 10 trials each. Within each sub-block, the number of high and low chords was balanced, and two trials (one high chord and one low chord trial) were no-go trials. Trial order within each sub-block was randomized. The first sub-block served as training, after which the instruction slide depicting the instructed stimulus-response mapping was shown again. The remaining trials proceeded with no breaks in between.
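The block structure described above can be illustrated with a short sketch. The function name `build_test_phase_trials` and the dictionary keys are hypothetical and chosen for illustration only; the sketch reproduces the stated constraints (five sub-blocks of ten trials, balanced chords, one high and one low no-go trial per sub-block, randomized order within sub-blocks) and makes no claim about how the original experiment was programmed.

```python
import random


def build_test_phase_trials(n_sub_blocks=5, trials_per_sub_block=10, seed=None):
    """Build a randomized trial list with balanced chords and two no-go trials
    (one high chord, one low chord) per sub-block."""
    rng = random.Random(seed)
    per_chord = trials_per_sub_block // 2
    trials = []
    for sub_block in range(n_sub_blocks):
        block = []
        for chord in ("high", "low"):
            # One no-go trial per chord; the remaining trials are go trials.
            block.append({"sub_block": sub_block, "chord": chord, "go": False})
            for _ in range(per_chord - 1):
                block.append({"sub_block": sub_block, "chord": chord, "go": True})
        rng.shuffle(block)  # randomize order within the sub-block only
        trials.extend(block)
    return trials


trials = build_test_phase_trials(seed=1)
training = [t for t in trials if t["sub_block"] == 0]    # first sub-block: training
test_trials = [t for t in trials if t["sub_block"] > 0]  # remaining 40 trials
```

Shuffling within each sub-block rather than across the whole list keeps the chord and go/no-go proportions constant over the course of the block.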

Design
All pairs performed two test phase blocks, once with a jointly compatible and once with a jointly incompatible stimulus-response mapping. Before performing the second test phase block in the remaining experimental condition, pairs repeated the joint part of the learning phase in the same seating positions as before. The order of the two test phase blocks was counterbalanced across pairs. Individual Compatibility of the stimulus-response mappings was manipulated between pairs, so that one half of the pairs performed both test phase blocks with an Individually Compatible and the other half of the pairs with an Individually Incompatible stimulus-response mapping.
As specified in our preregistration, we eventually limited our design to pairs' first test phase block only, as preliminary analysis of participants' test phase performance with Joint Compatibility as a within-subjects factor revealed significant block order effects (see Appendix B and C). This left the study with a 2 (Individual Compatibility: Individually Compatible vs. Individually Incompatible) x 2 (Joint Compatibility: Jointly Compatible vs. Jointly Incompatible) fully between-subjects factorial design.

Data analysis

Learning phase
For the solo part of the learning phase, trials with response omissions were removed prior to analysis (4.4% of all trials). From the remaining trials, error rates (ER LearnSolo, relative frequency of trials in which participants failed to produce a tone) and the ratio of high and low tones produced on successful trials were aggregated for each participant. For the joint part of the learning phase, trials in which both pair members omitted responses were removed prior to analysis (0.4% of all trials). From the remaining trials, error rates (ER LearnJoint, relative frequency of trials in which pairs failed to produce a chord together)⁷ and the ratio of high and low chords produced on successful trials were aggregated for each pair.
To assess potential differences between participants in the four test phase conditions regarding their learning phase performance, error rates and outcome ratios in the solo and joint part of the learning phase were analysed as a function of the experimental conditions participants performed in the subsequent test phase. The ratio of high tones produced in the solo part of the learning phase and the ratio of high chords produced in the joint part of the learning phase were compared against chance by means of one-sample t-tests, separately for each of the four test phase groups resulting from the factorial design. Error rates in the solo and in the joint part of the learning phase were compared between the four test phase conditions by means of an ANOVA with Individual Compatibility (Individually Compatible vs. Individually Incompatible) and Joint Compatibility (Jointly Compatible vs. Jointly Incompatible) as between-subjects factors.

Test phase
The first ten trials of the test phase served as training trials to familiarize participants with the trial procedure and were therefore excluded from analysis. For the remaining test trials, go and no-go trials were separated and response omissions on go trials were removed (3.9% of all go trials). For no-go trials, the frequency of erroneous responses was calculated for each participant. For go trials, error rates (ER Test, relative frequency of wrong responses) and mean response times (RTs) on correct trials were calculated for each participant. As specified in our preregistration, we also calculated a combined measure of participants' response performance on go trials that accounts for speed-accuracy trade-offs by combining error rates and mean RTs into inverse efficiency scores (IES), calculated as mean RT/(1 - ER Test) (Bruyer & Brysbaert, 2011). Furthermore, asynchronies between valid go-trial responses of both pair members were aggregated for each pair by calculating pairs' mean absolute response asynchronies (|ASY|) on valid go trials.
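To make the go-trial measures concrete, the following sketch computes ER Test, mean RT on correct trials, IES and the mean absolute asynchrony from toy data. The function names and the trial representation (dictionaries with `rt_ms` and `correct`) are assumptions made for illustration, and the numbers are invented, not data from the study.

```python
from statistics import mean


def go_trial_measures(trials):
    """Compute error rate, mean correct RT and IES for one participant.

    trials: list of dicts with 'rt_ms' (None for a response omission)
    and 'correct' (bool). Omissions are removed before aggregation.
    """
    answered = [t for t in trials if t["rt_ms"] is not None]
    error_rate = sum(not t["correct"] for t in answered) / len(answered)
    mean_rt = mean(t["rt_ms"] for t in answered if t["correct"])
    # Inverse efficiency score: mean RT / (1 - ER); undefined if ER == 1.
    ies = mean_rt / (1 - error_rate)
    return {"ER": error_rate, "RT": mean_rt, "IES": ies}


def mean_absolute_asynchrony(rt_pairs):
    """Mean absolute RT difference between pair members on valid go trials."""
    return mean(abs(rt_a - rt_b) for rt_a, rt_b in rt_pairs)


# Invented example data for one participant and one pair.
example = [
    {"rt_ms": 500, "correct": True},
    {"rt_ms": 600, "correct": True},
    {"rt_ms": 700, "correct": True},
    {"rt_ms": 550, "correct": False},
    {"rt_ms": None, "correct": False},  # omission, removed before analysis
]
measures = go_trial_measures(example)
asynchrony = mean_absolute_asynchrony([(500, 520), (610, 600)])
```

In this toy example, one error among four answered trials gives an error rate of .25, so a mean correct RT of 600 ms is inflated to an IES of 800 ms; higher IES values therefore indicate less efficient performance.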
As analysis of error rates in the joint part of the learning phase revealed significant differences between the four test phase groups regarding their learning phase performance (see Appendix A), we included error rates of the joint part of the learning phase (ER LearnJoint) as a covariate in the statistical analysis of participants' test phase performance. Hence, all dependent test phase measures were analysed as a function of the four test phase conditions by means of separate ANCOVAs with Individual Compatibility (Individually Compatible vs. Individually Incompatible mapping) and Joint Compatibility (Jointly Compatible vs. Jointly Incompatible mapping) as between-subjects factors and error rates of the joint part of the learning phase (ER LearnJoint) as a covariate.⁸

Open science statement
Sample size, data exclusion criteria, analysis plan and directed hypotheses were preregistered in the Open Science Framework. The preregistration is accessible online at https://osf.io/vk348. The inclusion of ER LearnJoint as a covariate in the analysis of participants' test phase performance was not preregistered, as differences between the four test phase groups regarding their learning phase performance were unexpected.

Results
Results for the learning phase can be found in Appendix A. The results for the test phase analysis are depicted in Fig. 5.
121. Yet, opposite to our initial predictions, participants' response performance was less efficient (implying higher IES) with a Jointly Compatible mapping (estimated marginal mean [EMM] = 667 ms, 95% CI [634 ms, 701 ms]) compared to a Jointly Incompatible mapping (EMM = 592 ms, 95% CI [557 ms, 627 ms]). There was also a significant effect of the covariate (ER LearnJoint), F(1,67) = 9.42, p = .003, ηp² = .123, indicating that participants responded less efficiently in the test phase the more errors they had made together with their partner in the joint part of the learning phase (r(70) = .474, p < .001). Neither the main effect of the Individual Compatibility factor nor its interaction with the Joint Compatibility factor was significant (both F < 1). Further exploratory analysis of the Individual Compatibility factor by means of an independent-samples Bayesian t-test showed that the IES data were BF01 = 9.2 times more likely under the null hypothesis predicting the absence of a performance advantage in the individually compatible compared to the individually incompatible test phase conditions.
The analysis of participants' mean RTs is displayed in Fig. 5B. Again, there was a significant main effect of the Joint Compatibility factor, F(1,67) = 12.6, p < .001, ηp² = .159, but, as for IES, the effect was in the opposite direction to that predicted. RTs in the Jointly Compatible conditions (EMM = 576 ms, 95% CI [555 ms, 597 ms]) were slower compared to the Jointly Incompatible conditions (EMM = 519 ms, 95% CI [498 ms, 542 ms]). There was also a significant effect of the covariate (ER LearnJoint), F(1,67) = 4.87, p = .03, ηp² = .067, reflecting slower response times in the test phase the more errors participants had made together with their partner in the joint part of the learning phase (r(70) = .358, p = .002). Again, neither the main effect of the Individual Compatibility factor nor its interaction with the Joint Compatibility factor was significant (both F < 1). Further exploratory analysis of the Individual Compatibility factor by means of an independent-samples Bayesian t-test showed that the RT data were BF01 = 7.4 times more likely under the null hypothesis predicting the absence of a response time advantage in the individually compatible compared to the individually incompatible test phase conditions.
7 This includes trials with missing responses by one and wrong responses by one or both pair members as well as trials in which pair members did not respond synchronously enough.
8 For analysis of participants' individual performance measures in the test phase (IES, RTs and ER Test), pair-level error rates in the joint part of the learning phase were treated as an individual-level variable for each pair member.
Analysis of participants' error rates on go trials of the test phase (Fig. 5C) revealed no significant main or interaction effects (all F < 1). Only the effect of the covariate (ER LearnJoint) on participants' test phase error rates was significant, F(1,67) = 5.01, p = .03, ηp² = .07, reflecting higher error rates in the test phase the more errors participants had made together with their partner in the joint part of the learning phase (r(70) = .374, p = .001).
Analysing the frequencies of participants' erroneous responses on no-go trials revealed no significant main or interaction effects (all F < 1) after controlling for the effect of the covariate, F(1,67) = 3.168, p = .08, ηp² = .05.
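As a reading aid for the effect sizes reported above, partial eta squared can be recovered from an F statistic and its degrees of freedom via the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below is not part of the original analysis; the function name is hypothetical, and values recomputed from rounded F statistics may differ from the reported ones in the last decimal place.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Covariate effect on IES, F(1,67) = 9.42
eta_ies_covariate = partial_eta_squared(9.42, 1, 67)
# Covariate effect on go-trial error rates, F(1,67) = 5.01
eta_er_covariate = partial_eta_squared(5.01, 1, 67)
```

Rounded to three and two decimals, respectively, these two values agree with the effect sizes reported for the covariate in the IES and error rate analyses.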

Discussion
The present study tested two possible accounts of how co-actors might learn and represent novel instrumental relationships between synchronized action contributions and the joint outcomes they produce together. According to a minimal account of joint action representations, action-outcome learning should only be sensitive to each co-actor's individual contributions to the joint action, which should have been reflected in a modulation of participants' test phase performance by our individual compatibility manipulation. According to a group-level account of joint action representations, action-outcome learning should be sensitive to relations between co-actors' individual contributions to the joint action, which should have been reflected in a modulation of participants' test phase performance by our joint compatibility manipulation. Our results showed that joint but not individual compatibility manipulations of the instructed stimulus-response mappings affected participants' response performance in the test phase. However, the result pattern we observed was not in line with the directed predictions we derived from associative accounts of action-outcome learning.

Action-outcome learning on the individual level
Against the prediction we derived from the minimal account, we found no evidence that participants' test phase performance benefitted from a stimulus-response mapping that was compatible with the preceding learning phase regarding individual-level features of each co-actor's isolated response contributions (left/right). This indicates that presentation of the two-tone chords in the test phase was unlikely to prime learned associations with individual-level features of participants' isolated response contributions. This would imply that participants did not represent the chord outcomes in the learning phase as contingent on their individual action contributions alone.
We propose that the lack of an individual compatibility effect can be attributed to the peculiarities of the synchronous joint action setting investigated in the present study. First, during the joint part of the learning phase, participants produced distinct chord outcomes that were contingent on the compound of two synchronous response contributions of both co-actors. This may have obscured contingent relationships between perceived outcomes and the isolated response contributions of each individual co-actor. The finding that co-actors did not acquire action-outcome associations on the individual level may thus be explained by overshadowing, an effect that has been observed in research on classical conditioning (e.g., Pavlov, 1927) and human contingency learning (e.g., Dickinson, Shanks, & Evenden, 1984; Shanks, 1989). Research in these domains has shown that learners tend to become insensitive to contingency relations between two consecutive events (e.g., an individual action and a perceived outcome) if the preceding event occurs in compound with another potential predictor (e.g., the action of another co-actor). This interpretation would also be in line with recent studies providing direct evidence for limitations of associative action-outcome learning in individual action contexts when the number of action and outcome possibilities increases beyond a limited set of simple one-to-one mappings (Flach, Osman, Dickinson, & Heyes, 2006; Watson et al., 2015).
Second, many studies in the domain of individual action indicate that learning and retrieval of associations between outcomes and low-level response features is not a quasi-automatic process but appears to be modulated by intentional and attentional factors (cf. Herwig & Waszak, 2009; Kiesel & Hoffmann, 2004; Pfister, 2019; Vogel, Rudolf, & Scherbaum, 2020; Zwosta, Ruge, & Wolfensteller, 2013). These studies indicate that people link perceived outcomes to selective features of their actions that are determined by top-down interpretational processes defining how produced actions are currently encoded (see also Ansorge & Wühr, 2004). Thus, the missing evidence for action-outcome learning on the individual level may indicate that participants encoded their responses in the learning phase not simply as individual left/right responses.
As we attribute the lack of an individual compatibility effect in our study to the peculiarities of synchronous joint action, future studies may extend our research to other joint action scenarios in which individual and joint action contributions and their respective outcomes are more clearly distinguishable in space and time. This would be the case in sequential joint actions in which co-actors produce individual actions/action outcomes in turns with each other to produce more distal joint action outcomes over time (e.g., Sacheli et al., 2018). Thus, an interesting question for future research would be whether action-outcome learning remains sensitive to individual-level response features of co-actors' isolated response contributions in other joint action contexts that deviate from the special case of synchronous joint action investigated in the present study.

Action-outcome learning on the group level
Turning to the prediction we derived from the group-level account, we did not find evidence that participants' test phase performance benefitted from a stimulus-response mapping that was compatible with the preceding learning phase regarding group-level features of co-actors' joint response contributions (inner/outer). In contrast, we observed an unexpected effect in the opposite direction, reflecting a performance advantage for participants who received jointly incompatible stimulus-response mappings in the test phase. This unexpected reversed joint compatibility effect speaks against the hypothesis that presentation of the two-tone chords in the test phase led to an automatic activation of associated group-level features of co-actors' joint response contributions. This finding indicates that the mechanisms proposed by simple associative accounts of goal-directed action (e.g., ideomotor theories) are not directly extendable to incorporate group-level features of joint action performance.
Nevertheless, the reversed joint compatibility effect still requires an explanation of why co-actors were reliably affected by alterations of group-level relations between their own and their partner's contributions to the joint action from learning to test phase. This finding implies that the action-outcome representations participants acquired in the learning phase must have been sensitive to group-level relations between their own and their partner's contributions to the joint action in some way.
9 Degrees of freedom were adjusted to the number of pairs.
A possible explanation of the reversed joint compatibility effect can be derived from theories postulating that acquired representations of action-outcome relationships are not stored as rigid bidirectional associations formed in long-term memory but as propositional knowledge structures represented in current working memory (Custers, 2023; Mitchell, De Houwer, & Lovibond, 2009; Seabrooke, Hogarth, & Mitchell, 2016; Sun, Custers, Marien, Liefooghe, & Aarts, 2022). According to these accounts, propositional representations of action-outcome relationships acquired in the learning phase could have influenced participants' test phase performance, not by priming responses associated with a former action outcome on a trial-by-trial level, but by modulating how participants translated the task instructions at the start of the test phase into task-relevant stimulus-response rules held in procedural working memory during the ensuing task (i.e., task sets, cf. Brass, Liefooghe, Braem, & De Houwer, 2017; Hazeltine & Schumacher, 2016; Monsell, 2003; Rogers & Monsell, 1995).
Thus, the reversed joint compatibility effect may have arisen because participants, as a function of their previous learning experience, performed the test phase in the respective conditions with different task sets in mind, encoding the ambiguously instructed stimulus-response mappings at the start of the test phase either in relation to their group-level performance (i.e., if WE hear a high/low chord, WE press our inner/outer buttons) or merely in relation to their individual-level performance instead (i.e., if I hear a high/low chord, I press my left/right button).
Specifically, pairs who received jointly compatible test phase instructions might have been inclined to encode the instructed stimulus-response mappings in relation to their group-level performance, as a group-level construal of the task instructions could be most easily reconciled with their previous group-level performance in the learning phase. In contrast, pairs who received jointly incompatible test phase instructions may have been reluctant to encode the instructed stimulus-response mappings in relation to their group-level performance, as a group-level construal of the task instructions would stand in conflict with their previous group-level performance in the learning phase. Due to this conflict during task-set formation (cf. Monsell, Taylor, & Murphy, 2001), participants receiving jointly incompatible test phase instructions may have reverted to encoding the instructed stimulus-response mappings in relation to their individual-level performance instead to ensure efficient response performance in the upcoming task.¹⁰ Following this explanation, the reversed joint compatibility effect could stem from the fact that implementing stimulus-response rules encoded on a group level (i.e., in the jointly compatible test phase conditions) places higher cognitive demands on participants than implementing stimulus-response rules encoded merely in relation to people's individual-level performance (i.e., in the jointly incompatible test phase conditions).
First, a possible reason for this could be that the implementation of stimulus-response rules encoded on a group level may require an additional processing step that specifies people's response contribution not only at the group level but at the individual level as well (e.g., WE press our inner buttons, so I press my left/right button). This assumption of a hierarchical specification of joint action representations would be in line with current theoretical models of joint action planning implicating a cascading processing hierarchy that proceeds from higher-level action representations related to group-level performance to lower-level action representations related to the individual-level action contributions of the separate co-actors (Candidi, Sacheli, & Aglioti, 2015; Keller et al., 2016; Pacherie, 2012; Pesquita et al., 2018; Sacheli et al., 2018; Sinigaglia & Butterfill, 2022; Zapparoli, Paulesu, Mariano, Ravani, & Sacheli, 2022). Therefore, participants may have performed worse in the jointly compatible compared to the jointly incompatible test phase conditions because they accessed individual-level representations of their own response contributions only indirectly, mediated through a higher-order action representation at the group level. In contrast, participants in the jointly incompatible test phase conditions would have performed better because they were able to access individual-level representations of their own response contributions directly without requiring an additional processing step.
Second, implementing stimulus-response rules encoded on a group level may also involve further specification of individual-level response features of other co-actors' contributions to the group's response as well (e.g., WE press our inner buttons, so I press my left/right button and YOU press your right/left button). This would be suggested by research findings demonstrating that people tend to co-represent others' tasks and actions (Atmaca et al., 2008; Kourtis, Sebanz, & Knoblich, 2013; Novembre, Ticini, Schütz-Bosbach, & Keller, 2014; Schmitz et al., 2018; Sebanz et al., 2003, 2005) as long as they are perceived as co-acting partners contributing to a joint task (Kourtis, Sebanz, & Knoblich, 2010; Meyer, Hunnius, Van Elk, Van Ede, & Bekkering, 2011; Sacheli et al., 2018, 2021). Thus, participants may have performed worse in the jointly compatible compared to the jointly incompatible test phase conditions because they faced additional processing costs associated with the specification of their partner's response contributions that would have been absent in participants in the jointly incompatible test phase conditions who encoded the stimulus-response rules merely in relation to their own individual-level performance.

Conclusion
Our study investigated how co-actors acquire joint action representations through repeated experience of novel instrumental relationships between their synchronized actions and the joint outcomes they produce together. While our results reveal limitations of purely associative accounts in explaining action-outcome learning in synchronous joint action contexts, they can be explained by propositional accounts of action-outcome learning and support the idea that joint action is guided by dedicated mental representations encoding group-level relations between co-actors' joint action contributions. Taken together, our study informs current theorizing on the underlying mechanisms and effects of action-outcome learning in individual and joint action settings and provides novel evidence that co-actors tend to form group-level representations of their joint actions if afforded by given task constraints.

CRediT authorship contribution statement
Maximilian Marschner: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Writing - original draft. David Dignath: Methodology, Supervision, Writing - review & editing. Günther Knoblich: Conceptualization, Funding acquisition, Investigation,
10 Notably, this idea would be in line with Vallacher and Wegner's (1985, 1987, 2012) theory of action identification, which formulates basic principles of how people conceptualize their actions. The theory assumes that people can conceptualize their actions at different hierarchical levels of abstraction, ranging from lower-level interpretations related to specific movements they perform to higher-level interpretations related to more distal ends of their actions. At which level people construe their actions is thought to be determined by three interconnected principles, stating that 1) a prepotent level of action identification is maintained unless 2) an action can be conceptualized at a higher level, creating a tendency to change to that higher level, or 3) an action cannot be performed in terms of the prepotent level of action identification, creating a tendency to revert to a lower level. Applied to our study, the theory predicts that co-actors would tend to conceptualize their actions at the highest level of abstraction afforded by the task (i.e., in terms of their group-level performance) until it would create problems for efficient task performance, at which point co-actors would tend to construe their actions at a lower level instead (i.e., in terms of their individual-level performance).

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1. Illustration of two possible representational structures that may result from action-outcome learning in joint action contexts (see main text for explanation).

Fig. 4. Illustration of the trial timeline for the joint part of the learning phase (A) and for the test phase (B). A) shows a valid trial in the joint learning phase. B) shows a test phase trial in which one participant produced a response error. Blue bars represent 1000 ms response windows in which responses were recorded in both phases. The trial timeline for the solo part of the learning phase was identical to A) apart from minor differences (see main text). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 5. Condition means of participants' inverse efficiency scores (IES) (A), mean response times (RT) (B) and error rates (C) and of pairs' mean absolute response asynchronies (|Asy|) (D) in the test phase. Condition means represent estimated marginal means adjusted for inclusion of ER LearnJoint as a covariate in the comparison. Error bars represent the respective standard errors of the mean. Significant main effects are marked with asterisks (** p < .01; *** p < .001).
Means and 95% confidence intervals of all dependent measures for each cell of the 2 (Individual Compatibility: Ind. Compatible vs. Ind. Incompatible) x 2 (Joint Compatibility: Joint Compatible vs. Joint Incompatible) x 2 (Block Order: Joint Compatible 1st vs. Joint Incompatible 1st) experimental design with Joint Compatibility manipulated within-subjects and Individual Compatibility and Block Order manipulated between-subjects. Confidence intervals are corrected for within-subjects designs (Morey, 2008). IES = inverse efficiency scores; RT = response time; ER = error rate; |Asy| = absolute response asynchrony between pair members.

Table B
Note. All dependent measures were subject to separate 2 (Individual Compatibility: Ind. Compatible vs. Ind. Incompatible) x 2 (Joint Compatibility: Joint. Compatible vs. Joint. Incompatible) x 2 (Block Order: Joint. Compatible 1st vs. Joint. Incompatible 1st) mixed ANOVAs with Joint Compatibility as within-subjects factor and Individual Compatibility and Block Order as between-subjects factors. IES = inverse efficiency scores; RT = response time; ER = error rate; |Asy| = absolute response asynchrony between pair members. Significant effects are marked in bold font.