Task-dependent coarticulation of movement sequences

Combining individual actions into sequences is a hallmark of everyday activities. Classical theories propose that the motor system forms a single specification of the sequence as a whole, leading to the coarticulation of the different elements. In contrast, recent neural recordings challenge this idea and suggest independent execution of each element specified separately. Here, we show that separate or coarticulated sequences can result from the same task-dependent controller, without implying different representations in the brain. Simulations show that planning for multiple reaches simultaneously allows separate or coarticulated sequences depending on instructions about intermediate goals. Human experiments in a two-reach sequence task validated this model. Furthermore, in co-articulated sequences, the second goal influenced long-latency stretch responses to external loads applied during the first reach, demonstrating the involvement of the sensorimotor network supporting fast feedback control. Overall, our study establishes a computational framework for sequence production that highlights the importance of feedback control in this essential motor skill.


Introduction
Everyday behavior involves performing sequences of actions to achieve a given goal.
Sequential motor skills depend on the ability to combine individual actions with short-term goals into a longer sequence consisting of multiple goals (1).However, the computations that underlie the combination of sequence elements and the neural circuits where this combination occurs remain a topic of debate (2)(3)(4).
Behavioral data from various sensorimotor repertoires indicate that the trajectory of sub-movements in a sequence is affected by multiple future goals (5)(6)(7)(8)(9)(10).For example, in human speech and bird song, the pronunciation of individual elements is influenced by subsequent elements (11,12), a phenomenon known as coarticulation.Ramkumar and colleagues (13) employed a computational model, revealing that concatenating multiple sub-movements into larger units, referred to as chunks (14,15), induces coarticulation within the velocity patterns of sub-movements.This coarticulation serves as a natural solution to enhance the efficiency of chunked movements.Collectively, behavioral findings highlight a pattern where sub-movements depend on the whole sequence, or a chunk.
It remains unclear how future goals are integrated in the sensorimotor system.For rapid execution of a sequence, one possible solution is to represent multiple goals within low-level control circuits (3,16), enabling the execution of several elements as a single entity, called "motor chunk".Note that chunking can also occur at a higher level such as in working memory-guided sequences, which in this case may or may not involve the production of a movement (17,18).Recent neural recordings in the primary motor cortex (M1) have shown no specific influence of future goals on the population responses governing ongoing action (19,20).Specifically, Zimnik and Churchland (20) observed in a two-reach sequence task that, there was no coarticulation in sub-movement kinematics although the execution got faster with practice.Notably, M1 displayed separate phases of execution related activity for each submovement.Using a neural network model, they interpreted that sequence goals could be separated and serially specified to the controller from regions upstream of M1 (Figure 1A).These findings contrast with earlier studies showing coarticulation of sub-movements and whole sequence representations in M1 (21)(22)(23).As a result, it has been suggested that coarticulation and separation in rapid sequences may involve distinct computations: coarticulation possibly involves replacing sub-movements with a motor chunk, while separation possibly indicates independent control of each sub-movement with chunking at a higher-level (4,20).Thus, there are unresolved questions regarding why sequential movements sometimes coarticulate, and how the representation of future goals in the sensorimotor system influences the way sequences are executed.
Strikingly, most studies have emphasized potential differences in the state of the sensorimotor system during preparatory activity (20)(21)(22)(23)(24).As a consequence, the role of feedback control in this debate has been unexplored, although it is known that human motor control involves continuous processing of sensory feedback (25,26).The theory of optimal feedback control (OFC) has been particularly useful in predicting the influence of numerous task parameters on the controller (27)(28)(29)(30)(31)(32)(33)(34), thus reproducing goal-directed motor commands during both unperturbed movements and feedback responses to disturbances (30).OFC has been used in numerous studies to interpret flexible feedback responses occurring in the longlatency response period (30,35).In this framework, all targets in a sequence can be considered simultaneously by the controller, and the overlap between control gains applied to each target determines when they influence the sequence (Figure 1B).The control gains themselves depend on task requirements (36)(37)(38).Thus, OFC predicts two experimentally testable features of sequence production: first, even if the sequence is considered as a whole at the level of controller, control gains can show different amounts of overlap based on task requirements, leading to either coarticulation or separation of the sequence elements; second, in sequences with strong co-articulation, feedback responses to perturbations are expected to depend on future targets due to the overlap of control gains.
Here we studied these predictions of OFC in a two-reach sequence task.Simulations revealed that the instruction to slow-down at the first target influenced the coarticulation of the two reaches.Importantly, the penalty for velocity at the first target produced an apparent separation of two reaches, even if the controller was always computed jointly for both reaches.We conducted two experiments with human participants and observed that their behaviour was fully consistent with the simulations.We observed both flexible coarticulation dependent on task instruction about the intermediate target, as well as a modulation of long-latency feedback responses to the mechanical loads applied during the first reach dependent on location of the second target when coarticulation was present.
These results demonstrate that both coarticulation and separation of sub-movements may arise as the consequence of efficient feedback control.Recordings of muscle activity suggest that this operation is carried out by the sensorimotor network that mediates fast state-feedback control.

Results
The results section is organized as follows: we first introduce the computational model and

Simulations of sequence execution
The model considers a task dependent state-feedback controller derived in the framework of stochastic optimal control (30) .The controller was optimized to produce translation of a point mass on a horizontal plane.A given estimate of a state vector  � (position, velocity, target location and external load) can influence the motor output  through multiplicative functions called 'feedback control gains', represented by ().Typically, the motor output at any time () is a result of a state feedback control law: Note that the control law is dependent on the state estimate ( �), which is computed by integrating delayed and noisy sensory feedback () with internal prediction of the state.
Importantly, feedback gains are tuned to task demands (36,39), and vary within a movement (37).This task dependency gives rise to instances where all state variables ( �) are simultaneously available to the feedback controller, but some of them may not influence the motor output if deviations of the state vector in the corresponding dimension do not impact the performance.
We applied this control formalism to a task composed of a two-target sequence of reaching movements, such that two different sequences shared the first target, after which they diverged towards the second target (T1 and T2, Figure 1C).In the model, the estimated positions of the targets (1 � , 2 � ) are in the state estimate ( �), hence the control law can be decomposed and expressed in the following equation (also see Figure 2A): where ( 1 ,  2 ) are the feedback gains that determine how much and when each target (1 � , 2 � ) influences the motor output (Figure 2A).This formulation highlights that, even when both targets are simultaneously available to the feedback controller, the overlap between control gains ( 1 and  2 ) will determine if and when the second target influences the first reach (Figure 2B).In the simulations, the control gains and the overlap between the two components of the sequence were not hardcoded but were found as an optimal control solution that improves the efficiency of the sequence as a whole (i.e., as a solution to holistic sequence planning in the sense that all targets are specified to the controller at once).For holistic sequence planning, we formulated the efficiency in terms of a cost function on the terminal movement errors and speed of reaching at both targets (T1 and T2), and total motor output across movement duration (see model with terminal costs in Methods).The model simulations revealed that, even though the sequence was planned as a whole, the necessity to slow-down at the first target, which was induced by an intermediate cost on speed at the corresponding time, influenced when the first reach became sensitive to the second target (Figure 2C, D), i.e., when the two reaches in the sequence coarticulate.
When there was no constraint on speed at the first target, the simulated hand reached the first target with a high velocity (Figure 2C, left panel), and subsequently continued towards the second target at a high initial speed of > 0.2 m/s.In this rapid transfer scenario, the control gains showed a cross-over effect where the control gain associated with the second target (k2) was non-zero, albeit small, during the first reach, showing an early overlap with the feedback gain (k1) (Figure 2C, middle panel).As a result, the location of the second target influenced the lateral hand deviation of the first reach (Figure 2C, right panel).The first reach was thus different depending on the location of the second target.Notably, the first reach was similarly influenced by the second target even in a different model where temporal buildup costs were considered instead of terminal costs (see temporal buildup cost model in Methods), to penalize the hand deviation from the targets across the duration of the sequence (Supplementary Figure S1A).
Because the control gains also determine the response to external perturbations, the trajectories following external loads applied to the mass were also dependent on the second target (Supplementary Figure S2A).Interestingly, the lateral deviation along the perturbed direction was smaller when the second target was in the direction of perturbation.Although this may appear counterintuitive, the model shows that it is an efficient solution, because a higher acceleration linked to reversal at T1, when the perturbation is in the direction of T2, incurs a higher motor cost.Hence, a smaller hand deviation is necessary to avoid high acceleration at T1.This result qualitatively persisted across broad variations in parameters related to the costs, delays and point-mass dynamics (not shown).A similar result was observed when we used temporal buildup of kinematic error in the cost function (Supplementary Figure S2B), instead of using the simple terminal costs, matching human data closely (see the experimental results).
To the contrary, the requirement to slow down at the first target, captured by an increased cost on speed, reduced the velocity at the end of first reach and altered the shape of the trajectories (Figure 2D, left panel) compared to the rapid transfer scenario (Figure 2C, left panel).The control gain corresponding to the second target became non-zero at a later time, at the end of the first reach (Figure 2D, middle panel).Due to the later increase in this gain value, the influence of the second target on the motor output during the first reach was much reduced, hence generating similar hand path to reach the first target across the two sequences (Figure 2D, right panel).Similarly, feedback responses to lateral perturbations that were applied early-on in the first reach were not dependent on the location of the second target (Supplementary Figure S3A).The amount of lateral deviation when an unexpected perturbation was applied did not vary as a function of the direction of the second target.Qualitatively similar influence of the need to slow down at the first target on the trajectory of first reach was observed when temporal buildup costs were included instead of terminal errors (Supplementary Figure S1B and S3B).
It is worth noting that the OFC model can be generalized to longer sequences (10) through the incorporation of additional cost terms (in Equation 10 of Methods) and targets, enabling simultaneous planning for more than two targets.Simulations of a sample threereach sequence (Supplementary Figure S4) revealed that, varying the cost of dwell velocity at intermediate targets ( 2 and  3 parameters in Methods) caused a variation in control gains.Different amount of change in control gains can be expected for intermediate versus late targets (Supplementary Figure 4A).Notably, even when we used the same dwell velocity cost ( 2 =  3 = 0), the observed velocity profiles were different between the two sequences towards different final targets (T3 up and T3 down) (Supplementary Figure 4B).We tested a condition in which both sequence reaches were forced to have similar dwell velocity profiles by increasing the dwell velocity costs in the sequence towards one of the targets (T3 down), while leaving this parameter unchanged for the other target (T3 up).In this scenario, T3 up sequence had the parameters ( 2 ,  3 ) = (0, 0), while T3 down sequence had the parameters (0.8, 0.8).In this case, the curvature of the first reach was different, and predominantly occurred due to differences in  2 between the two sequence reaches (Supplementary Figure S4C).These simulations highlight that, planning for a longer horizon sequence can indirectly influence the curvature of early reaches, due to the interaction between intermediate dwell constraints, spatial arrangement of targets, and sequence horizon in a task dependent manner.
Overall, the model predicted that even if a feedback control policy was computed by optimizing the whole sequence over a long time-horizon, the requirements associated with intermediate goals determine how early in the sequence the second (future) target can influence the feedback controller (compare Figure 2C and Figure 2D; Supplementary Figure S1A and S1B; Supplementary Figure S2 and S3).

Correspondence between Human Behaviour and Model Predictions
The experiments were designed to verify whether task requirements at the intermediate goals influenced hand trajectories and feedback control in a way that was expected from the model simulations.The first experiments tested the scenario in which there was no penalty on velocity during the stop-over at the first target, referred to as the GO task.We instructed 15 human participants to perform a two-target sequence of reaching movements (Figure 1C-F).Both targets in a given sequence were displayed from the beginning of the trial, allowing for preparation of both reaches.After receiving a GO cue (500-1000 ms after target display), participants were required to reach the first target within a prescribed time (500 -650 ms), and then to proceed to the second target.There was no instruction to stop at the first target.
To encourage fast transfer between the two reaches, the participants were allowed to remain in the first target for a maximum of 150 ms after a successful reach, before initiating the second reach, otherwise the second target disappeared.To determine if the feedback controller was tuned to the location of the second target, mechanical loads (spanning 200 ms width) in randomized direction (rightward or leftward) were applied randomly in 20% of the trials soon after the hand left the starting location (4.5 cm in forward direction from the starting location (this occurred typically at ~200ms after the movement onset) The model prediction about the influence of the second target during the first reach was clearly borne out of the experimental results.Figure 3A illustrates the average hand speed across baseline trials in the rapid transfer scenario.It is clear that the hand did not come to rest at the end of first reach, but continued towards the second target at a relatively high transfer speed of ~0.2 m/sec.The hand speed at the first target was slightly different between the two second target conditions (Figure 3A, inset; paired t-test: t(14) = -3.8,p = 0.002).In line with model predictions, in the rapid transfer scenario, the location of the second target influenced the hand path during the first reach (Figure 3B).Statistically, across trials and participants, lateral hand deviation during the first reach was significantly different between the two sequences (Figure 3B, inset; paired t-test: t(14) = -5.9,p < 10 -4 ).

Figure 3 Hand kinematics in the GO task (experiment 1). (A) hand speed and (B) hand path in unperturbed baseline trials. (C, E) same as (A) but for perturbation trials. (D, F) same as (B) but for perturbation trials. (G) Baseline EMG response of PEC and PD muscles (H) Perturbation response (baseline subtracted) of PEC and PD muscle during
Note that the experiments involve a block design where a sequence direction was maintained for 10 trials (with 2 randomly interleaved rightward and leftward perturbation trials), while the blocks themselves were randomly interleaved (Figure 1E).To test whether the influence of second target on hand deviation was due to trials occurring late within a block, we computed the hand path in the first and last trials and averaged across all blocks for each participant.Group data in Supplementary Figure S5 illustrate that the hand deviation was significantly different between the leftward and rightward sequence conditions from the very first trial within a block (paired t-test: t(14) = 5.7, p < 10 -3 ).
Similarly, the hand deviation was significantly different between leftward and rightward sequences in the last trial within a block (paired t-test: t(14) = 5.9, p < 10 -3 ).Additionally, a temporal effect was observed in the leftward sequence condition, where the last trial incurred a higher hand deviation compared to the first trial within blocks (paired t-test: t(14) = 3.9, p = 0.003).Note that the above results include correcting for multiple comparisons using Holm-Bonferroni method.Overall, the block design of the experiment may lead to small differences between first and last trials within a block, but the effect of second target on the lateral hand deviation was present from the very first trial within a block, suggesting that qualitatively similar results can be expected in experiments without a block design.
During perturbation trials, hand trajectories during the first reach were influenced by the location of the second target.In rightward perturbation trials, the dwell velocity and hand deviation (Figure 3C, D) were significantly different across the two second target conditions (paired t-test: t(14) = -7.1,p < 10 -5 for dwell velocity; t(14) = -7.9,p < 10 -5 for hand deviation).Similar differences across the two sequences were observed in the leftward perturbation trials (Figure 3E, F) (paired t-test: t(14) = 6.9, p < 10 -5 for dwell velocity; t(14) = -8.2,p < 10 -6 for hand deviation).The influence of second target on the lateral hand deviation was qualitatively similar to that observed in model simulations, and counterintuitive to what we might expect without the help of the model simulations.As observed in the model simulations (see also Supplementary Figure S2), lateral hand deviation was smaller when the perturbation was in the direction of the second target (T2) and vice-versa.This was consistent for both rightward and leftward perturbation conditions.
Both the model and humans expressed this strategy that can be seen as an emergent feature of efficient feedback control during production of movement sequences.Additionally, even though behavior was reproduced in simulations, changing the cost on control effort and/or accuracy of intermediate reaches could modulate the sequence-dependent changes in curvature.
We identified a pair of muscles in the upper-arm shoulder joint (Pectoralis Major, PEC, and Posterior Deltoid, PD) that were strongly recruited by the lateral perturbation (see earlier studies with a similar experimental setup (34,40)).These muscles also steer the hand towards the first and second targets in unperturbed condition.PEC muscle displayed agonist burst coinciding with acceleration of hand towards the first target, while the PD muscle provided the antagonist burst to decelerate the hand near the first target.Notably, both muscles displayed a clear difference in baseline EMG activity during the first reach, depending on whether the following reach involved right or left second targets (Figure 3G).

Sequence-dependent long-latency feedback responses
We assessed when the muscle stretch and shortening responses show dependency on the second target.If the two targets were simultaneously used to form an efficient feedback controller for the whole sequence, one would expect that the sensitivity of the response to the second target to become visible in the long-latency epoch, considering responses in this epoch as a proxy of flexible, state-feedback controller in the primate nervous system (30,35).Overall, the EMG responses indicated a significant influence of the second target on the long latency responses in both stretched and shortened muscles (Figure 4).When perturbed in the rightward direction, PEC muscle (agonist) response increased in both sequence conditions, which counters the perturbation by increasing the leftward pull.Given the lower hand deviation in the rightward sequence (Figure 3D), we asked if the reduced deviation was due to a larger leftward pull from the PEC muscle.Indeed the PEC stretch response was significantly larger in the rightward sequence than the leftward sequence in the long-latency epoch, in both R2 (one tail paired t-test: t(14) = 3.5, p < 0.002) and R3 (one tail paired t-test: t(14) = 2.25, p = 0.02).On the other hand, the PD muscle shortening response was significantly larger (below baseline) in the rightward sequence than the leftward sequence in both R2 (one tailed paired t-test: t(14) = -3.2,p = 0.003) and R3 epochs (one tailed paired t-test: t(14) = -3.5, p = 0.002).When the perturbation load was to the left, the PEC shortening response (below baseline) was significantly larger in the leftward sequence compared to the rightward sequence in both R2 (one tail t-test: t(14) = 7, p < 10 -5 ) and R3 epochs (t(14) = 8.1, p < 10 - 6 ).Such a large unloading response in the PEC muscle could facilitate a larger correction in the leftward sequence by reducing the leftward pull in the direction of the applied load.
While the shortening response in PEC was larger, the stretch response in PD muscle was not significantly larger in the leftward sequence throughout the long-latency time period (one-tail paired t-test; R2 epoch: t(14) = 2.8, p = 0.99; R3 epoch: t(14) = -0.13,p = 0.44).
On the contrary, the PD stretch response was significantly smaller in the leftward sequence in the R2 epoch (one-tailed t-test: t(14) = 2.8, p = 0.006).All muscles exhibited differences in the voluntary time window (120 -180 ms) during the first reach depending on the location of the second target (two-tailed paired t-test: t ( 14) absolute value between (3.5 -7.4), and p-value between (1e -6 -0.012) across muscles and perturbation types).
Next, we verified that the differences reported above did not follow from mere differences in baseline muscle activities at perturbation onset.Muscle responses exhibit a property called 'automatic gain scaling' where the stretch response increases proportionally to the pre-perturbation muscle activity (41).If gain-scaling was significant, then we expected to observe a difference in the R1 period response between the two sequence conditions.We did not observe any sequence-related modulation of the stretch responses in the R1 epoch, suggesting that any potential differences at baseline did not have a strong impact.It is also important to realize that the sequence direction that involved a larger stretch response in the corresponding muscle was always associated with lower levels of activity in the baseline condition, thus potential biases induced by baseline differences were in the direction opposite to the effect reported above.
Overall, the results indicate, when required to perform movement sequences rapidly, the sensorimotor circuits that mediate long latency feedback responses showed modulation that was dependent on the secondary target.The specificity of controller to the second target was evident when perturbations were applied early-on during the first reach.These observations agree with the predictions of the feedback control model that, with low or absent penalty on velocity at the first target, the second target influences the hand path and feedback responses to perturbations prior to the first target.

Separation of Sequence Elements
Another clear prediction of the model is that when there is a cost on velocity at the first target, the first reach should be less dependent on the second target.To test this prediction experimentally, we performed a STOP task where 14 subjects were asked to perform the same two-target reaching movements with a different speed constraint (Figure 1F).They were required to bring the hand to a near-complete halt (speed < 0.5 cm/s) before starting the second reach.
Figure 5 illustrates that the hand speed at the first target dropped to near zero levels in all unperturbed, leftward and rightward perturb conditions.Clearly, there was no influence of the second target on the velocity of reaching the first target (two-tailed paired t-test: t(13) = -1.9,p = 0.08 in the baseline condition; t(13) = -0.01,p = 0.99 for rightward perturbations; t(13) = 2.1, p = 0.055 for leftward perturbations).The hand deviation during the first reach was similar between two sequences (two-tailed paired t-test: t(13) = -1.8,p = 0.08 in the baseline condition; t(13) = -1.2,p = 0.26 for rightward perturbations; t(13) = -1.2,p = 0.25 for leftward perturbations).Moreover, the unperturbed EMG responses of the PEC and PD muscles were similar throughout the first reach, and diverged only after the end of first reach, while moving towards second targets (T2).In rightward and leftward perturbation conditions, the perturbation responses (ΔEMG) of the PEC and PD muscles (Figure 5D, E and Figure 6) were not statistically different in pre-perturbation, R1, R2, or R3 epochs.Only the PEC perturbation response, to a leftward perturbation, showed a weak effect of the second target location (R2 epoch: t(13) = 2.1, p = 0.055; R3 epoch: t(13) = 2.44, p = 0.03).To conclude, these results suggested that the second target did not influence the first reach when the hand was required to slow-down at the first target.A straightforward interpretation could be that the stopping at the first target invoked a completely different strategy in which the control of the two reaches was performed independently (Figure 1A), effectively separating the two movements, whereas executing them rapidly could produce the merging of the two sub-movements into a coarticulated sequence.While this is conceptually valid, it is not necessary and the model provides a more nuanced view: both apparent separation or coarticulation of the two motor patterns can be explained within the same framework of flexible feedback control.These different modes of sequence execution still require proper specification of the task constraints in the model, such as number of intermediate steps, dwell-time, or velocity limit.Such specifications must be considered as input to the controller.

Discussion
The results presented here are consistent with recent behavioral studies showing that the information about future targets in a sequential task is integrated at the stage of motor planning (6,7,10,42,43).Our key contribution is two-fold: we showed from computational and behavioral perspectives that the coarticulation and separation of sequence components depended on task instructions, and likely reflected efficient solutions of the same continuous control problem with distinct penalties at the intermediate goals; second, the results highlighted that long-latency feedback control was modulated by future targets in a coarticulated sequence, consistent with theories suggesting that movement planning is synonymous with derivation of a control policy (44,45).
A key question the model and experiments addressed was whether the integration of multiple goals required processing in upstream voluntary circuits, indicating timed, voluntary recall of future goals (20,46,47).Instead, our results support the idea of flexible and simultaneous processing of sequence elements, where low-level controller plays a crucial role in combining these elements.Our model based on optimal feedback control showed that even if two reaches are planned as a whole, the second target does not necessarily influence the controller early.The constraints placed on swiftness of transition around the first target modulated when the second target started to influence the sequence.This was clearly apparent in the hand deviation during the first reach, and further revealed by perturbing participants' hand with external loads, evoking sequence-dependent longlatency responses in Experiment 1.Consistent with this model (Figure 1B), recently an artificial neural network model with simultaneous processing of targets reproduced the population activity of primary motor cortex (see (48)), a region that contributes significantly to long latency feedback responses (35,49).Together, these studies point to the importance of considering feedback gains, and suggest a special role of transcortical feedback through M1 in the production of movement sequences.Notably, in the framework of optimal feedback control, an intermediate goal is equivalent to a via-point that constrains the execution of the sequence (similar to ( 13)).It is thus possible that coarticulation in motor systems be processed similarly as other kinds of movement constraints, such as via-points, avoiding obstacles, or changes in control policies.
Although OFC has been predominantly used as a behavioral level framework agnostic to neural activity patterns, it can shed light on the planning, state estimation and execution related computations in the transcortical feedback pathway (50).Using OFC, our study proposes a novel and precise definition of the difference to expect in neural activities in order to identify coarticulated versus independent sequence representations from a computational point of view.Because each condition (i.e., overlapping versus nonoverlapping controllers as in Figure 2) was associated with different cost-functions and time-varying control gains, it is the process of deriving these control gains, using the internal representation of the task structure, that may differ across coarticulated and separated sequence conditions.To our knowledge, how and where this operation is performed is unknown.A corollary of this definition is that the preparatory activity (20,51) may not discern independently planned or coarticulated sequences because these situations imply different control policies (and cost functions), as opposed to different initial states.
Moreover, the nature of the sequence representation is potentially not dissociable from its execution for the same reason.
Many studies highlight a phenomenon called chunking, where a subset of individual elements of a sequence are concatenated and represented as a unified whole called chunks (13,14,24,52).In the pursuit of finding motor chunks, recent neural recordings explored for holistic (or) unified representations in the circuits that produce motor commands, and found no such representations (4,19,20,53).Hence it was suggested that chunking resulted from a cognitive or high-level recall of intermediate goals sent to the controller for independent execution (4) (see Figure 1A)(but, see (54)).On the contrary, our developments also provide a computational definition of motor chunks.From our model, motor chunking (or a unified representation) does not necessarily imply a single (unified) preparatory period signal representing multiple goals.Motor chunking may represent the formation of a composite cost function, hence a composite control policy, that can jointly optimize multiple sub-movements (see Figure 1B).In this view, motor chunking results from an acquired motor ability to form overlapping control policies for contiguous sequence elements (Figure 1B and Figure 2).Importantly this property depends on the instruction at intermediate goals, thus even a motor chunk may falsely appear to be independently controlled (Figure 1A), masking the signature of sequence representation in neural data even if the controller was formed at once by considering multiple goals.
There are differences between the model and human movements that requires further study, such as the influence of sequence direction on the dwell velocity, which was not observed in the model simulations (compare velocity plots in Figure 2C and Figure 3A).A limitation is that nonlinear dynamics were ignored in our model.However, the potential impact of non-linear mechanics on limb trajectory could be completely suppressed by participants in Experiment 2, since we observed straight reach path, thus making the curvature observed in Experiment 1 more likely related to the task constraints rather than the biomechanics which did not differ between experiments.Yet, the quantification of how biomechanical factors influence control may deserve a dedicated work, as it has been shown that biomechanics can also influence coarticulation in a speech production task (55).Despite the limitations, the simple linearized OFC models were crucial in determining the relationship between three properties of the task demands, its impact on coarticulation, and feedback control.Our approach is consistent with how ideal actor/estimator models have been predominantly employed as normative models in formalizing an initial hypothesis, and simulating these to generate testable predictions (34, 36, 38-40, 50, 56-61).Such normative models should be distinguished from detailed biomechanical models that can trade-off interpretability and predictability for achieving better quantitative fits.Furthermore, the main feature of the model -to generate a task-dependent motor output as a combination of state feedback -is not limited to linear controllers and linear body dynamics, but applicable even when the control policy is non-linear (see (62,63)).
The modeling and empirical results presented here describe the computations underlying a two-reaches sequence.Although two is admittedly the lowest possible length of a non-trivial sequence, it enables comparison with similar tasks that have been used in recent neurophysiological studies (20)(21)(22)48) to understand the neural computations underlying the control of sequences.We expect that longer sequences be formed in a similar way as we showed that there is no theoretical constraint on the number of intermediate goals in the model and the addition of sequence elements is straightforward.Thus, we expect the combination of two elements can be the basis of longer sequence formation constructed on the same principles.
Besides the implication of our computational model of co-articulation, it has been often documented that swift execution of sequences was achieved by habituation, where a set of sequences are practiced for several days to months (13,19,20,23,53,54,64,65).
Such process potentially involve a slow time-scale linked to the acquisition of a motor skill (65,66), which differs from the results presented here, as the participants were able to execute accurate sequences after a pre-training of around 100 repetitions.Here, the participants received explicit information about the dwell constraints at the intermediate goal, resulting in quick coarticulation (also see (9,10,(67)(68)(69)).This is in line with taskdependent modulation of existing reaching controllers, which can be rapidly switched from one trial to another (34,70).This suggests that there exist several mechanisms operating at different time-scales and responsible for combination of sequence elements.One mechanism is a slow habituation process in which the subject learns de novo to build a novel controller for different elements in the sequence (54,71), whereas the other mechanism is a fast modulation of this consolidated controller by task demands.In principle, the two mechanisms are dissociable with perturbations, which may allow future studies to investigate how they interact and to identify which brain regions mediate these components of skilled motor behaviour.

Experiment Design
A total of 29 volunteers (11 men, 18 women, age between 20 and 35 years) participated in 2 experiments (n = 15 with 9 women in Experiment 1; and n = 14 with 9 women in Experiment 2).All participants were neurologically healthy, right-handed and gave their informed consent according to a protocol approved by the ethics board at UCLouvain.Experiments did not exceed 2 h, and participants were compensated for their time.We were most interested in comparing how the feedback responses are influenced by the location of the second target, even though the perturbations were applied in the beginning of the first reach.Accordingly, we calculated the average EMG response of each participant in the perturbed trials and subtracted this with the average EMG responses across unperturbed trials for each block within a set.We refer to this quantity as perturbationrelated muscle activity (ΔEMG).Paired t-tests were performed using group average to determine if the location of second target had any statistical effect on the perturbationrelated muscle activity.It should be noted that corrections for multiple comparisons do not apply here for two reasons: first, the samples at each epoch are involved in only one comparison; second, consecutive samples are not statistically independent.If there is a significant difference at a given epoch, it is very likely that there will be a significant difference in the next epoch because signals do not vary instantaneously.Hence, the risk of false-positive must not be controlled.

Model
The model describes the translation of a point-mass similar to the mass of an arm (m = 2.5 kg) in the horizontal plane.The coordinates corresponded to the experiments, such that the first target was directly in the forward reaching direction (along the y-dimension), and the second targets were at the same y-coordinate but displaced in the x-dimension by 8 cm to the right or to the left of the first target.

Dynamics of the effector
The state-space representation of the point-mass (from (40, 57)), with mass 'm', was as follows: ̈= −̇+   + and The system was discretized by using a first order Taylor series expansion over one time-step of  ∶   =  + , and   =  ( is the identity matrix).We used a discretization step of  = 0.01 .
Importantly, the state space representation of body dynamics was augmented with the x and y-coordinates of the two targets T1 and T2 (denoted by  1 * ,  1 * ,  2 * ,  2 * ), and the system was re-written in discrete-time equations as follows: The subscript  is the time-step, and   is a Gaussian disturbance with zero-mean and covariance ∑  =   , where  = 0.2 is the scaling coefficient.

Control policy
The control policy () produces motor commands (  ) as a function of the estimated state of the point-mass (t).The state-estimate is computed using Kalman filtering as a weighted sum of internal model prediction, and the sensory feedback representing the delayed state information ( (−) ).The sensorimotor delays were set at  = 50  in the simulations.
The details of the Kalman filtering can be found in the previous paper (75).Overall, motor commands are computed by a linear control policy: The control gains (  ) determine how much each state variable can influence the motor command, and are computed to minimize a quadratic cost on the behavior (), using standard dynamic programming procedures (Riccatti equations) described in previous literature (57,75).So, the control policy can be formalized as finding the trajectory of the control gains '  ' that minimize the expected costs on the entire sequence (i.e., a holistic plan): Importantly, the control policy from Equation 9, receives information about both targets within a sequence as state feedback information, in addition to the continuous but delayed position, velocity and forces on the end-effector.

Cost function
As described above, the controller design minimizes a quadratic cost function that captures the intended behavior (in this case a two-reach sequence task).The cost function () was defined as a function of the state-estimate (), and the motor commands () used to control the point-mass.In a linear quadratic regulator (LQG), the control problem is to find a control sequence  * which minimizes the expected value of the sum of J(, u).
Model with terminal costs: Typically, in the case of single reaches, the cost function penalizes the distance between the target and the actual position of the end-effector (terminal or trajectory errors), speed, and motor costs.But an efficient reaching movement in the context of single reach does not ensure the efficiency of the sequence as a whole.
Hence, we consider a composite cost function, that penalizes terminal errors between the end-effector and both targets, to optimize the sequence elements as a whole.Additionally, we can encourage swift or slower transfer between sequence elements by applying a cost on the velocity at the moment when the system approaches the first target (i.e., the Equations 11 and 12 penalize the terminal cost on the positional error relative to T1 and T2 at time-steps N1 and N2 respectively.An equal weight ( 1 = 500) was used for both terminal error terms.Equations 13 and 14 penalize the squared velocity at each target, and the last Equation 15 penalizes high motor output.A rapid transfer between two reaches in a sequence can be simulated by removing the penalty on the velocity at the first target.Hence, for this case,  2 = 0.1, whereas a slower sequence involves slowing down significantly at the first target, which correspond to a parameter set to  2 = 1 in the simulations.In all cases, we set  4 =10 -4 as the coefficient on the motor costs term (‖  ‖ 2 ).Qualitatively similar results can be obtained with a broad range of parameter values ( 1 ,  2 ,  3 ,  4 ).
Model with temporal buildup costs: The terminal costs, described above, are sufficient to produce accurate movements with low endpoint errors, while neglecting the trajectory errors before the endpoint.Another possibility is to penalize deviations in the trajectory from the desired endpoint targets throughout the movement duration.Notably, by using temporal buildup of costs, the endpoint errors incur higher cost, while incurring lower but non-zero cost in the movement duration before the endpoints.Such formulation has been recently applied to emulate human reaching control under uncertain environmental dynamics, in the context of elementary reaching movements (40,57).To implement the temporal buildup costs model, the position and velocity terms in Equations 11 -14 were modified as follows: describe the results of the simulations showing that separate or coarticulated sequence production can result from flexible feedback control without implying a different structure in the controller (Figure1A-B).Second, we present the two behavioural experiments in which participants were instructed to perform a sequence of two reaching movements with an intermediate goal (Figure1C-F).Our key manipulation guided by the model simulations was to give them different instructions regarding the pause at the first target, which produced the modulation of sequence production predicted in theory.

Figure 1
Figure 1 Schematic illustration of hypotheses and experiment method.(A) independent control hypothesis where each target is specified one at a time onto the controller (B) flexible feedback control hypothesis where multiple targets are simultaneously considered (C) Experimental setup of two-reach sequences with two different second targets (T2) sharing the first target (T1).Perturbations of 7 N magnitude (rightward or leftward) were applied close to the start location (D) Sequence of events in a typical (example) trial (E) Block design where the right, left-target blocks are randomly interleaved within a set where each block consist of 8 unperturbed and 2 perturbed trials.(+) and (-) represent the direction of applied loads in perturbation trials (F) Experiment protocol for the 'GO' and 'STOP' tasks,with differing demands at the first target.

Figure 2
Figure 2 Simulations of the OFC model in two-reaches sequence task.(A) Schematic illustration of the feedback control policy.Targets influence motor output through control gains, which are themselves derived by minimizing movement costs (B) Illustration of how overlap of control gains can explain coarticulated reaches (top panel), and late influence of second target on first reach (middle).(C) Hand speed (left), control gains corresponding to the two targets (middle), and hand path (middle) when dwell velocity at target 1 was not constrained.(D) same as (C) except the dwell velocity was penalized.
Figure 3 Hand kinematics in the GO task (experiment 1).(A) hand speed and (B) hand path in unperturbed baseline trials.(C, E) same as (A) but for perturbation trials.(D, F) same as (B) but for perturbation trials.(G) Baseline EMG response of PEC and PD muscles (H) Perturbation response (baseline subtracted) of PEC and PD muscle duringrightward and (I) leftward perturbations.Light curves in the hand path panels are average paths of individual participants.Shaded area around the average curves represent the standard error (±sem) across participants around

Following
the perturbation (left or right), the stretch responses were visibly larger in both PEC and PD muscles compared to the baseline response (Figure 3H, I).The perturbation response in the stretched muscle -PEC for rightward load (Figure 3H, left panel) and PD muscle for leftward load (Figure 3I, right panel) -was visibly larger if the second reach involved movement reversal compared to the first reach where the perturbation was in the direction of the second target.Interestingly, even the unloading response (below baseline) in the unstretched (shortened) muscle -PD for rightward load (Figure 3H, right panel) and PEC for leftward load (Figure 3I, left panel) -was larger if the second reach involved direction reversal.Overall, the perturbation responses -both stretch and shortening muscle responses -indicated vigorous feedback corrections against the load when the second target induced movement reversal at the first target, explaining the smaller lateral deviation of the hand when the perturbation was in the direction of the second target.

Figure 4
Figure 4 EMG perturbation responses in the GO task (Experiment 1).(A) PEC muscle response to a rightward perturbation (B) PD muscle response to a rightward perturbation (C) PEC muscle response to a leftward perturbation (D) PD muscle response to a leftward perturbation.Red and blue colors represent sequences to leftward and rightward second targets respectively.Shaded area represents the standard error (±1 sem) across participants around mean.All traces were aligned to perturbation onset.t = 0 ms indicates the time of perturbation onset.*p < 0.05, **p < 0.005, ***p < 0.001 and n.s indicates non-significant effects of one-tailed t-test with p > 0.05.
Figure 5 Hand kinematics in the STOP task (experiment 2).(A) hand speed and (B) hand path in unperturbed baseline trials.(C, E) same as (A) but for perturbation trials.(D, F) same as (B) but for perturbed trials.(G) Baseline EMG response of PEC and PD muscles (H) Perturbation response (baseline subtracted) of PEC and PD muscle during rightward and (I) leftward perturbations.Light curves in hand path panels are average paths of individual participants.Shaded area around the average curves represent the standard error (±sem) across participants around the mean.Gray colored box indicates the average durationof the first reach.Red and blue colors correspond to reaches made towards left and right second targets respectively.t = 0 ms represents the time of perturbation onset.*p < 0.05, **p < 0.005, ***p < 0.001.

3 )
̇=   −   (   ̇=   −   ( 4 ) Dot(s) correspond to time derivatives.The variables   and   are the forces applied by the controlled actuator to the mass, and these forces are a first order response to the actual control vector denoted by variables   and   .The parameter  is a dissipative constant (0.1 Nsm -1 ), Equations 3, 4 capture the first order muscle dynamics with time constant set to 0.1 s, and   represents the external load applied to perturb the point-mass (if any).Defining the state vector as  = [, , , ̇,   ,   ], and the control vector  = [  ,   ], we have ̇=  + , with: A=