Motor planning under uncertainty

Actions often require the selection of a specific goal amongst a range of possibilities, like when a softball player must precisely position her glove to field a fast-approaching ground ball. Previous studies have suggested that during goal uncertainty the brain prepares for all potential goals in parallel and averages the corresponding motor plans to command an intermediate movement that is progressively refined as additional information becomes available. Although intermediate movements are widely observed, they could instead reflect a neural decision about the single best action choice given the uncertainty present. Here we systematically dissociate these possibilities using novel experimental manipulations and find that when confronted with uncertainty, humans generate a motor plan that optimizes task performance rather than averaging potential motor plans. In addition to accurate predictions of population-averaged changes in motor output, a novel computational model based on this performance-optimization theory accounted for a majority of the variance in individual differences between participants. Our findings resolve a long-standing question about how the brain selects an action to execute during goal uncertainty, providing fundamental insight into motor planning in the nervous system.

where intermediate movements between potential goals were rendered infeasible, using tasks with high-threshold movement speed criteria or with potential targets that had a wide spatial separation [9]. These tasks incentivize direct movements towards a single potential target for success, and thus promote a high-level explicit choice between targets before movement onset, rather than low-level motor planning given uncertainty. Such tasks are therefore unlikely to provide mechanistic insight into implicit motor planning.

It has been difficult to dissociate between motor averaging and performance-optimization because the motor plan that leads to task success often resembles an average of individual-target motor plans [5]. Indeed, close examination of previous studies that claim to provide support for motor averaging reveals not only that performance-optimization was not considered, but also that the observed results were, in fact, consistent with performance-optimization. For example, Chapman and colleagues (2010) examined motor planning during goal uncertainty by employing a task where participants were asked to make rapid reaching movements towards one of several potential target locations, with the final target cued only after movement onset (i.e., "go-before-you-know"). Similarly, Gallivan and colleagues (2016) studied motor planning at the level of feedback control policies in an analogous go-before-you-know task, but used targets of different widths in order to modulate the gain of feedback responses [35,36]. In both cases, intermediate actions were elicited when the goal was uncertain. In the Chapman study, the resulting movement was directed at the midpoint between the locations of the potential targets, and in the Gallivan study, the resulting feedback gain was sized at the midpoint between the gains associated with the potential targets.
Although the authors interpreted these behaviors as evidence for motor averaging, in both cases the results can also be explained by performance-optimization: in the Chapman study, initial movements toward the spatial midpoint bring the hand closer to all potential targets, thus reducing the size of the subsequent movement required to reach the final target once identified, and in the Gallivan study, intermediate feedback gains balance the cost of the effort needed for movements executed with high feedback gains against the likelihood that this effort will be necessary.

In another example, Stewart and colleagues (2014) used the go-before-you-know paradigm to create goal uncertainty, but in combination with obstacles that were positioned to alter motor planning to one of the potential targets. In trials where the goal was initially uncertain, the obstacle placements resulted in intermediate movements that were deflected in a manner consistent with motor averaging, and this was taken as evidence for the motor averaging hypothesis. However, the obstacles were configured in such a way that the observed deflections on goal-uncertain trials, claimed to result from motor averaging, also happened to improve the safety margin around the obstacle. Therefore, performance-optimization for the goal-uncertain trials, which would result in motor planning that improved safety margins to minimize the likelihood of obstacle collisions, would also readily predict the experimentally observed deflections, calling the evidence for motor averaging into question.

The interpretational and methodological issues of the studies outlined above are emblematic of the pervasive confound between motor averaging and performance-optimization in studies examining motor planning under uncertainty.
In the current study, we resolve the debate between these hypotheses by designing two different sets of experiments that allow rigorous dissociation of performance-optimization from motor averaging. In one, we employ obstacle-based perturbations of motor planning, as in Stewart et al. 2014, but using novel obstacles that break the congruence between the predictions for performance-optimization and motor averaging. In another, we create a novel dynamic environment that induces adaptive responses which make entirely opposite predictions for performance-optimization versus motor averaging. In both cases, we find clear evidence that, when faced with uncertainty, humans form a single motor plan that optimizes task performance.

RESULTS

To decouple motor averaging (MA) from performance-optimization (PO) in both experiments (Expt 1 and Expt 2), we created modifications of the popular go-before-you-know task [5,8,9,12,15-18], where uncertainty is introduced on some trials by presenting participants with multiple potential reach targets and disclosing the final goal location only after movement onset. We designed paradigms that altered motor plans in such a way that the MA and PO theories would make contrasting predictions when the reach goal was uncertain. In Expt 1, we accomplished this by pre-training a force-field (FF) environment that physically perturbed 1-target movements to left and right lateral target locations with one FF environment (FFA), but perturbed movements to a center target location with the opposite FF (FFB).

In Expt 1 (n=16), we employed a version of the 'go-before-you-know' task in which a combination of different FF environments was used to investigate movement planning under uncertainty.
While gripping a robotic manipulandum that could apply forces to the hand (Fig 1a), participants initiated 20cm cued-onset reaching movements towards either a single pre-specified target (1-target trials; Fig 1b, left) or two potential targets (2-target trials; Fig 1b, right). On 1-target trials, the target, located at a left (+30°), right (-30°), or center (0°) eccentricity from the midline, was displayed for 1000ms before an auditory go cue was delivered. On 2-target trials, the same pair of potential targets always appeared in the left and right target locations for 1000ms before the auditory go cue, but one (randomly selected) was extinguished 3cm after movement onset, leaving only the final reach target on-screen. Comparing the data from 1- and 2-target trials thus allowed us to infer how uncertainty about the final target location influences motor planning on 2-target trials.

performance-optimization (PO) for feedforward motor planning during uncertainty. Because both potential targets (left and right) were associated with FFA, MA (purple arrows) predicts a force pattern consistent with FFA on 2-target trials. However, since the initial motion on these 2-target trials is in the direction of the center target, PO (green arrows) predicts the force pattern consistent with FFB, which is opposite the MA prediction.
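
The opposing MA and PO predictions described above can be illustrated with a minimal numerical sketch. It assumes full adaptation, a viscous curl field of the common form F = s·B·R·v, and a hypothetical gain `B_GAIN`; none of these values or function names come from the study itself.

```python
import numpy as np

# Hypothetical viscous curl gain (N*s/m); an assumption, not a value from the study.
B_GAIN = 15.0

# Sign of the FF trained for each 1-target direction (FFB = -FFA).
TRAINED_SIGN = {"left": +1, "center": -1, "right": +1}


def curl_force(velocity, sign):
    """Perturbing force of a viscous curl field: perpendicular to the hand velocity."""
    rot = np.array([[0.0, 1.0], [-1.0, 0.0]])
    return sign * B_GAIN * rot @ np.asarray(velocity, dtype=float)


def ma_prediction(velocity):
    """Motor averaging: mean of the compensations learned for the two potential targets."""
    comps = [-curl_force(velocity, TRAINED_SIGN[t]) for t in ("left", "right")]
    return np.mean(comps, axis=0)


def po_prediction(velocity):
    """Performance-optimization: compensation suited to the intermediate (center) movement."""
    return -curl_force(velocity, TRAINED_SIGN["center"])


v = [0.0, 0.4]  # hand moving straight ahead at 0.4 m/s
print("MA:", ma_prediction(v), "PO:", po_prediction(v))
```

Under these assumptions the two predicted compensatory forces have equal magnitude and opposite sign, which is what makes the design diagnostic.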

We designed a novel physical environment composed of multiple FFs that would result in very different predictions from MA vs. PO for motor planning on 2-target trials. Specifically, we trained participants to adapt to a viscous curl FF (see methods) that perturbed the left/center/right movements during 1-target trials with a FFA/FFB/FFA composite environment (Fig 1c), where FFB = -FFA, and the sign of the FFs was balanced across participants. For 2-target trials, MA would predict that participants average the force patterns (FFA in both cases) learned for the left and right (lateral) targets, which correspond to the potential target locations on 2-target trials (Fig 1d, left). In contrast, PO would predict that participants produce the force pattern (FFB) appropriate for optimizing the planned intermediate movement, since this movement maximizes the probability of successful target acquisition [5,34] (Fig 1d, right). Importantly, and in contrast to previous studies 5, 12

To specifically determine how movements are planned in uncertain conditions, we examined the force patterns predicted by MA and PO for 2-target trials, and compared them to the force patterns we measured on 2-target trials (Fig 2c). Experimental data on 2-target trials show that participants systematically produce positive forces that are at odds with MA-based predictions, but in line with PO-based predictions for motor planning during uncertainty. We quantified the similarity between the 2-target trial data and the predictions using a prediction index (PI) that produces a value of +1 if the mean force from the experimental data matches the PO prediction perfectly, -1 if it matches the MA prediction perfectly, and 0 if the data are halfway between the two predictions (see methods). We measured this PI over two intervals: one extended for the duration we could reasonably use to examine feedforward motor output, spanning movement onset through feedback response onset (T_RESP); the other was more conservative, spanning movement onset through target cue onset (T_ON). In both cases, the PI estimates indicate that the observed 2-target trial force pattern is markedly more consistent with the PO than the MA prediction.

would, on average, be associated with a FF in between FFA and FFB, rather than a FF comprised solely of FFB. This effect would be greater for movements farther off-target, resulting in a small but definite variability-dependent bias in the FF intended to be associated with a given target. We thus performed an additional experiment (Expt 1-GEN, n=10) where we measured the adaptation associated with a range of "off-target" movements to determine the size of this variability-induced bias (Fig 2e).
Specifically, we employed a task design identical to Expt 1, with multi-FF training in the left/center/right target directions, but we replaced 2-target trials with 1-target EC trials and positioned them in directions that enabled a dense sampling for generalization of the trained adaptation (every 7.5° in between the trained target directions). The resulting adaptation levels, shown in Fig 2e, indicate that the multi-FF environment generalizes nonlinearly across movement directions, with noticeable changes in adaptation around the trained target directions (a ~34% change 7.5° from the 0° trained target and 38-42% changes 7.5° from the ±30° trained targets).
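
The prediction index (PI) introduced earlier in this section can be sketched as follows. The exact formula is given in the methods and is not reproduced here, so this version simply assumes a linear mapping of the mean measured force onto the MA-to-PO axis; the function name and arguments are illustrative only.

```python
import numpy as np


def prediction_index(data_force, ma_force, po_force):
    """Map the mean measured force onto [-1 (MA), +1 (PO)].

    Assumption: the PI is linear in the mean force over the analysis window,
    which satisfies the three anchor values stated in the text (+1, -1, 0).
    """
    data = float(np.mean(data_force))
    ma = float(np.mean(ma_force))
    po = float(np.mean(po_force))
    if po == ma:
        raise ValueError("MA and PO predictions coincide; the index is undefined.")
    return 2.0 * (data - ma) / (po - ma) - 1.0


# Illustrative force samples (N) over an analysis window, not data from the study.
ma_pred = [-2.0, -2.0, -2.0]
po_pred = [2.0, 2.0, 2.0]
print(prediction_index([1.5, 1.0, 2.0], ma_pred, po_pred))
```

The same function can be evaluated separately over the two windows described above (movement onset to T_ON, and movement onset to T_RESP) by slicing the force traces accordingly.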

The participant's expected level of adaptation was then used to scale the raw force profiles that comprise the predictions. We found that this refinement reduced the magnitude of the predicted peak forces by 20-25% for both the MA and PO models (see Fig 2c vs 2f). Although the data were clearly more in line with the raw PO prediction than the raw MA prediction, there was still a noticeable mismatch between the PO prediction and the data (Fig 2c,g). However, taking generalization into account results in a refined PO prediction that is in even greater alignment with the data, corresponding to prediction indices that are closer to +1 at both

movements performed under uncertainty arise from a single action plan that optimizes task performance.

Obstacle avoidance can elucidate the mechanisms for motor planning under uncertainty

In a second experiment, we used a different experimental approach to assess whether motor averaging (MA) or performance-optimization (PO) can explain the intermediate movements that arise from motor planning in uncertain conditions. We designed this experiment based on a subtle variation of an influential study that supported motor averaging [8]. However, here we show that the original instantiation of this experiment, which we replicate (Expt 2a), fails to make readily dissociable predictions for MA vs PO, but that a subtle modification of it (Expt 2b) leads to highly dissociable, and in fact essentially opposite, predictions for these two models of motor planning under uncertainty.

In Expts 2a (n = 8) and 2b (n = 26), participants made 20cm cued-onset reaching arm movements (Fig 3a). We instructed participants to reach towards either a single pre-specified target or two potential targets, as in the 1- and 2-target trial designs from Expt 1 (obstacle-free trials; Fig 3b). After a baseline period in which participants practiced these trial types (see methods), we presented a virtual obstacle between the start position and, for example, the left target (Fig 3c-d). Please note that for simplicity, the subsequent explanations in this section reference a left-side obstacle condition; however, the experimental design was balanced within participants to include an equal number of interspersed right-side obstacle trials. In Expt 2a, we used an obstacle with the same size and positioning as that in Stewart et al. 2014 (ref. 8). This obstacle protruded 2cm to the right and effectively infinitely far to the left of the direct path between the start position and the left target (i.e., the obstacle-obstructed target), and thus required rightward deflections for left 1-target trials.
In Expt 2b, we used a pared-down version of this obstacle that protruded 2cm to the right but 0cm to the left of the direct path between the start position and left target. This allowed for both leftward and rightward travel paths around the obstacle, but promoted less circuitous leftward deflections (see Fig 3c). Accordingly, all eight participants in Expt 2a consistently veered rightward as required around the obstacle, and all twenty-six participants in Expt 2b consistently veered leftward

obstructed and unobstructed targets for motor averaging, whereas the refined prediction permits differential weighting. Analogously, for PO, the original prediction assumes equal weighting for the two determinants of task success: obstacle avoidance and movement timing, whereas the refined prediction permits differential weighting. For

Specifically, a model for PO would have two objectives on obstacle-present 2-target trials: (1) to reach the final target within the required timing criteria and (2) to avoid obstacle collision, as these two objectives form the basis of task success (see methods). We therefore modeled the PO predictions as an average of the movement deflections that would arise if PO were to independently prioritize each objective, expressing the PO prediction as an equal balance of the two motor costs associated with the determinants of task performance. Prioritization of movement timing, or rapid target acquisition, would predict a movement direction midway between the two potential targets (i.e., a net 0° deflection), as this would maximize the probability of successful target acquisition during uncertainty [34]. However, prioritization of obstacle avoidance would predict deflections that incorporate a safety margin around the obstacle that is proportional to an internal estimate of variability [42].
To determine the expected safety margin for obstacle-present 2-target trials, we therefore scaled the magnitude of the safety margin observed on obstacle-obstructed 1-target trials by the ratio of their observed variabilities,

$$\hat{\theta}_{2,i} = \frac{\sigma_{2,i}}{\sigma_{1,i}}\,\theta_{1,i}$$

where $\hat{\theta}_{2,i}$ again represents the predicted mean deflection, or safety margin, on obstacle-present 2-target trials for each participant $i$, $\sigma_{2,i}$ and $\sigma_{1,i}$ represent each participant's observed variability on obstacle-present 2-target trials and obstacle-obstructed 1-target trials respectively, and $\theta_{1,i}$ again represents each participant's observed mean deflection, or safety margin, on obstacle-obstructed 1-target trials.

Performance-optimization predicts motor planning for obstacle avoidance

As in Expt 1, we focused our analysis on the initial portion of motor output, measured as the initial movement direction (IMD), as this reflects feedforward motor planning. We calculated the IMD as the direction of the hand at the midpoint of the movement relative to the direction of the hand at movement onset (but note that we obtained qualitatively similar results for all analyses when the IMD was calculated 4cm into the movement). Each 2-target trial IMD was subsequently assigned a positive polarity if a deflection away from the obstacle occurred, and a negative polarity if a deflection towards the obstacle occurred. To facilitate a fair comparison between obstacle-free and obstacle-present 2-target trial IMDs (Fig 4b), the polarity of obstacle-free 2-target trials was arbitrarily assigned from one trial to the next.
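
The variability-scaled safety margin and the equal-weight PO prediction described earlier in this section can be written out as a short sketch. The function names, the `w_obstacle` weighting parameter, and the example numbers are hypothetical; the only relationships taken from the text are the scaling by the ratio of variabilities, the 0° deflection for the timing objective, and the equal balance between the two objectives.

```python
def predicted_safety_margin(margin_1target, sd_2target, sd_1target):
    """Scale the 1-target safety margin by the ratio of observed variabilities."""
    if sd_1target <= 0:
        raise ValueError("1-target variability must be positive.")
    return margin_1target * sd_2target / sd_1target


def po_deflection(margin_1target, sd_2target, sd_1target, w_obstacle=0.5):
    """Weighted balance of the timing objective (0 deg) and the obstacle objective.

    w_obstacle = 0.5 corresponds to the equal weighting of the original prediction;
    other values are a hypothetical stand-in for the refined, differentially
    weighted prediction.
    """
    timing_deflection = 0.0  # midway between the two potential targets
    obstacle_deflection = predicted_safety_margin(margin_1target, sd_2target, sd_1target)
    return (1.0 - w_obstacle) * timing_deflection + w_obstacle * obstacle_deflection


# Illustrative values in degrees, not data from the study.
print(predicted_safety_margin(8.0, 6.0, 4.0))  # scaled safety margin
print(po_deflection(8.0, 6.0, 4.0))            # equal-weight PO prediction
```

A participant who is more variable on 2-target trials than on 1-target trials is therefore predicted to leave a proportionally larger margin around the obstacle.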
Correspondingly, on obstacle-present 2-target trials, we found that participants displayed IMDs that were consistently biased away from the obstacle compared to baseline obstacle-free IMDs in both Expt 2a (6.3 ± 4.3° vs -2.1 ± 2.4° [mean ± 95% CI], p = 7.90 × 10^-3, t(7) = -3.16), and Expt 2b (3.6 ± 0.77° vs -0.2 ± 0.6°,

observed IMDs were closer to the PO prediction than the MA prediction by a large margin in Expt 2b (see Fig 4b and Fig 5a, 5d).

The worsened prediction for the MA model produced by this weighting refinement indicates that the assumption for equal weighting present in the original model was not the reason that it failed to explain the experimental data.

Performance-optimization predicts individual differences in obstacle avoidance

Panels 5b and 5e illustrate the relationship between the 2-target trial IMD, $\theta_2$, and the obstacle-obstructed 1-target trial IMD, $\theta_1$, the latter of which was determined as the IMD relative to the angular protrusion of the obstacle into the direction of the observed deflection (i.e., the safety margin). We specifically plotted these variables because the MA and PO models both predict relationships between them (see Eq. 1-2); however, in both cases, the relationship between the predicted 2-target trial IMD, $\hat{\theta}_2$, and $\theta_1$ is not a singular one. In the MA model, $\hat{\theta}_2$ also depends on the unobstructed 1-target trial IMD, and in the PO model, $\hat{\theta}_2$ also depends on $\sigma_2/\sigma_1$. Correspondingly, these panels show the raw data of $\theta_2$ (black circles) as well as versions of $\theta_2$ that correct for the effects of the unobstructed 1-target trial IMD (purple circles) to evaluate the MA model predictions, and for the effects of $\sigma_2/\sigma_1$ (green circles) to evaluate the PO model predictions.
We found that $\theta_2$, when corrected for the unobstructed 1-target trial IMD, did not display a consistent positive correlation with $\theta_1$, as the correlation was positive for the data from Expt 2a (r = +0.91, p = 1.51 × 10^-3, t(6) = 5.49; 1-sample t-test) but not for the data from Expt 2b (r = -0.22, p = 0.28, t(24) = -1.11; 1-sample t-test). In contrast, we found that $\theta_2$, when corrected for $\sigma_2/\sigma_1$, consistently displayed a positive correlation with $\theta_1$ (Expt 2a: r = +0.88, p = 3.70 × 10^-3, t(6) = 4.60; Expt 2b: r = +0.71, p = 4.89 × 10^-9, t(24) = 4.94; 1-sample t-test). This suggests that performance-optimization can explain not only mean behavior during obstacle avoidance, but also differences between one individual and another.

To better understand the ability of the PO model to predict the individual differences present in $\theta_2$, we next sought to rigorously examine the importance of individuating each predictor,

indicate that individuating the obstacle-obstructed 1-target trial IMD $\theta_1$ is far superior to using the population-averaged IMD $\bar{\theta}_1$ in explaining $\theta_2$, whereas a value of 0 would indicate the reverse. We fit this model onto the pooled aggregate of datasets from Expt 2a and 2b, and found that it was able to explain 79.5% of the variance for individual differences in $\theta_2$, with values of 0.71 and 0.63 for the two model parameters, respectively, indicating that

We designed two novel experimental paradigms that powerfully dissociated between the hypotheses proposed to underlie motor planning under uncertainty: motor averaging (MA) and performance-optimization (PO). In Expt 1, we designed an environment that physically perturbed motion in the direction of potential target locations off-course, and motion in the direction of intermediate movements oppositely off-course. Participants readily adapted to this composite environment on 1-target trials.
Critically, on trials with two potential targets, participants displayed motor output strongly aligned with adaptive responses for intermediate movements, which was consistent with the PO prediction, but grossly opposite the MA prediction for motor planning under uncertainty (see Fig 2). An effort to refine the model predictions by taking the observed spread of movement directions into account resulted in even greater alignment between the observed motor output and the PO prediction.

In Expts 2a-b, we replicated (Expt 2a) the paradigm from a well-known study that provided support for MA [8] but did not qualitatively dissociate the predictions for MA and PO, and then made a small modification (Expt 2b) that allowed for MA and PO to be powerfully dissociated. Specifically, in Expt 2a, we positioned an obstacle so that movements to one of the potential targets would be skewed in a direction consistent with increasing the safety margin for intermediate movements during uncertainty. Thus, qualitatively, the PO prediction based on creating an appropriate safety margin for intermediate movements during uncertainty, and the MA prediction based on averaging the motor plans for potential targets, were in the same direction. But quantitatively, we found the experimental results to be significantly closer to the PO predictions than the MA predictions. In Expt 2b, we altered the obstacle from Expt 2a so that movements to one of the potential targets would be skewed in a direction opposite to that needed for increasing the safety margin for intermediate movements during uncertainty, resulting in both qualitative and quantitative differences in the predictions for MA and PO.
Experimentally, we found that motor output during trials with two potential targets consistently increased the safety margin for intermediate movements in accordance with the PO prediction, but was grossly opposite in direction to obstacle-induced changes in the MA prediction. Subsequent refinement of the MA model to allow for different weightings of the motor plans associated with the obstacle-obstructed and unobstructed targets did not improve its prediction. On the other hand, the PO model accurately predicted the population-averaged changes in motor output for both experiments as well as a remarkable ~80% of the variance for individual differences between participants, suggesting that internal estimates of variability and uncertainty can be critical for motor planning. Collectively, our results provide clear evidence that humans generate a single motor plan that reflects optimization of task performance, rather than averaging over multiple potential plans.

attempted to dissociate whether motor output might reflect an average of sensory or motor representations of movement plans (i.e., sensory averaging vs motor averaging) by using obstacles to perturb these motor plans, but did not dissociate either sensory or motor averaging from performance-optimization.

Another study [16] applied a visuomotor rotation (VMR) to movements toward one potential target to perturb motor planning without affecting the sensory representation of the target. This resulted in initial movement directions (IMDs) that were altered during 2-target trials in accordance with the perturbed motor plans, and thus in line with the prediction for motor, rather than sensory, averaging.
However, PO predicts an IMD identical to that predicted by MA: since the imposed VMR shifts the final hand position associated with acquisition of the potential target to which it was applied, a corresponding shift in the IMD that accounts for this VMR perturbation would optimize performance by minimizing the cost of corrective movements following disclosure of the final target location.

In another study, performed by Nashed and colleagues (2017) [35], motor planning under uncertainty was studied using a task that was based on grip force (GF) control. Specifically, participants grasped an object capable of measuring GF and made reaching movements while environmental dynamics were experimentally manipulated with robotically applied load forces that affected the required GF. This task design was used to attempt a dissociation of MA from PO. However, although GF is known to be substantially more sensitive to the variability in environmental dynamics than to the mean dynamics [42], this study examined the MA and PO hypotheses for motor planning under uncertainty using predictions that were based entirely on the mean environmental dynamics. Taken together with the lack of information provided about environmental variability and its effect on required GFs, it is unlikely that this study can shed light on motor planning under uncertainty.

Neural representations of motor planning under uncertainty

Studies suggesting motor or sensory averaging have been motivated by reports of simultaneous deliberation of competing potential goals in sensorimotor brain areas (i.e., parallel planning) [1,8,12,18,33]. These studies argue that the motor plans prepared in parallel are averaged, resulting in the intermediate movements observed when uncertainty is present during motor planning. The neural evidence for parallel planning is based on studies of delay period activity associated with motor planning when multiple potential targets were present [28,29,31,32]. However, because this activity was measured using single-electrode recordings, only a small number of cells could be recorded simultaneously, and thus data from same-type trials were aggregated to make population-based estimates of the planned motion. The results suggested that this aggregate delay period activity was tuned to both potential targets in sensorimotor areas, in particular dorsal premotor cortex (PMd) and parietal reach region (PRR), in monkeys. However, the observed tuning could arise from simultaneous parallel planning for both potential targets, as the authors argued, or from planning-related activity associated with one target on some trials and with the alternate target on other trials.

Recently, Dekleva and colleagues (2018) [43] tested for parallel planning using an electrode array to record simultaneously from 100-160 neurons in PMd so that planning-related neural activity could be examined for individual movements when two potential targets were present. Critically, they found that neural activity during the delay period was consistent with motor planning directed for either one target or the other on individual trials, rather than with parallel planning for both potential targets.
While these results cannot rule out parallel planning in other brain areas, they call into question the evidence for parallel planning from previous studies that relied on trial-aggregated data.

In summary, the current findings indicate that motor planning during uncertain conditions does not proceed from averaging parallel motor plans, but instead involves the creation of a single motor plan that optimizes task performance given knowledge of the current environment. These findings are compatible with the current neurophysiological data and offer a mechanistic framework for understanding motor planning in the nervous system.

MATERIALS AND METHODS

Participants

Twenty-six participants (twenty-five right-handed; 15 female; age range 18-42) performed the multi-force-field (multi-FF) adaptation experiments, with sixteen participants in Expt 1 and ten participants in Expt 1-GEN. Thirty-four participants (thirty-two right-handed; 18 female; age range 18-33) performed the obstacle avoidance experiments, with eight participants in Expt 2a and twenty-six participants in Expt 2b. Participants were assigned to experiments based on when they responded to advertisements and their availability for scheduling. When different experiments were running concurrently, participants were randomly assigned for participation. The sample sizes used for each experiment were determined based on pilot data and existing literature [8,38,39]. All participants used their right hands to perform the experiments, were naïve to the purpose of the experiments, and were without known neurological impairment. The study protocol was approved by the Harvard University Institutional Review Board, and all participants provided written informed consent.
Experiment Protocols

Apparatus for multi-force-field adaptation experiments (Expt 1 and Expt 1-GEN)

Participants were instructed to grasp the handle of a two-joint robotic manipulandum with their right hands and make rapid 20cm point-to-point reaching arm movements in the horizontal plane to either a single pre-specified target (1-target trials) or two potential targets (2-target trials). All visual information, including veridical feedback of hand motion provided in the form of a white 3mm-diameter cursor, was displayed on a vertically oriented LCD computer monitor (refresh rate of 75Hz). Participants were positioned such that their midlines were aligned with the middle of the monitor, and their right arms were always supported with a ceiling-mounted sling. The manipulandum measured hand position, velocity, and force, and its motors were used to dynamically apply prescribed force patterns to the hand, all of which were updated at a sampling rate of 200Hz.

Targets and feedback (Expt 1 and Expt 1-GEN)

On 1-target trials, the target was located at a left (+30°), right (-30°), or center (0°) eccentricity from the midline, and on 2-target trials, a pair of potential targets always appeared at the left and right target locations. Participants were instructed to initiate a trial by moving the cursor to a start position (green filled-in circle, 10mm-diameter), after which the target or targets (yellow hollow circles, 10mm-diameter) were presented. 1000ms after target presentation, an auditory go cue signaled participants to initiate a movement. Movement onset was subsequently determined online as the time when the hand velocity exceeded 5cm/s or the time when the hand traveled 3cm from the start position, whichever occurred first. Participants were required to initiate movements after the onset of the go cue, but no later than 425ms after the go cue finished playing. If movement onset was detected outside these bounds, a message that read either 'Too Soon!' or 'Too Late!' was displayed above the start position, and was accompanied by a unique tone. Furthermore, because pilot data showed that participants may sporadically stop and initiate a discrete reach to the final target immediately following its disclosure on 2-target trials, we also required participants to maintain their velocity throughout the first half of every movement (i.e., while the displacement was less than 10cm). Specifically, during each trial, if the instantaneous maximum velocity declined more than 33% during the first half of the movement, we played a unique buzzer tone. If any of these requirements were not fulfilled, the trial was discarded.
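
The movement-onset rule and go-cue timing window described above can be sketched offline as follows. This is an illustrative reconstruction under stated assumptions (hand position sampled as an N×2 array in metres, velocity estimated by finite differences); the function names are hypothetical and the study's online implementation may differ.

```python
import numpy as np


def detect_movement_onset(t, pos, start, speed_thresh=0.05, dist_thresh=0.03):
    """Return the first time the hand speed exceeds 5 cm/s or the hand has
    traveled 3 cm from the start position, whichever occurs first.

    t: sample times (s); pos: N x 2 hand positions (m); start: start position (m).
    Returns None if neither criterion is ever met.
    """
    t = np.asarray(t, dtype=float)
    pos = np.asarray(pos, dtype=float)
    vel = np.gradient(pos, t, axis=0)  # finite-difference velocity estimate
    speed = np.linalg.norm(vel, axis=1)
    dist = np.linalg.norm(pos - np.asarray(start, dtype=float), axis=1)
    hits = np.flatnonzero((speed > speed_thresh) | (dist >= dist_thresh))
    return None if hits.size == 0 else float(t[hits[0]])


def onset_feedback(onset, go_cue_start, go_cue_end, late_limit=0.425):
    """Classify onset timing relative to the go cue (times in seconds)."""
    if onset < go_cue_start:
        return "Too Soon!"
    if onset > go_cue_end + late_limit:
        return "Too Late!"
    return None  # onset within the permitted window


# Synthetic example at the 200Hz sampling rate: rest for 0.5 s, then move at 0.4 m/s.
t = np.arange(0.0, 1.0, 0.005)
y = np.maximum(0.0, t - 0.5) * 0.4
pos = np.column_stack([np.zeros_like(t), y])
print(detect_movement_onset(t, pos, start=[0.0, 0.0]))
```

Combining the two criteria with a logical OR and taking the first sample that satisfies either one is what implements "whichever occurred first".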
For the 2-target go-before-you-know trials, the final target (randomly selected on each trial) was filled in with yellow, and the distractor target was simultaneously extinguished once a 3cm displacement between the hand and start position was achieved. For consistency, on 1-target trials, the target also filled in with yellow at the same point in the movement. After the hand reached the final target, we provided performance feedback based on the movement time, determined as the time interval between movement onset, as defined above, and movement offset, defined as the first timepoint when the hand was both within 6mm of the final target, and the hand speed in the subsequent 300ms period was below a threshold of 6.35cm/s. Following movement offset, visual and auditory feedback were presented by changing the fill color of the target and playing a sound, depending on whether the movement time was faster than (red fill-in color, buzzer tone), within (green fill-in color, chirp tone), or slower than (blue fill-in color, buzzer tone) the required interval. This movement time interval was based on thresholds that were adjusted online per participant as described below. After feedback was delivered, the robotic manipulandum guided participants' hands back to the start position. Participants completed blocks of trials throughout the experiments but were allotted 1min rest breaks in-between each block (see training schedule details below).

Movement time thresholds
We used data-driven updating for specifying the movement time thresholds. The "too-slow" threshold was set at the 70th percentile of the movement times for the last 18 trials of the same type (1-target or 2-target). Thus, separate thresholds were maintained for 1-target vs 2-target trials. The 1-target trial movement time threshold ranged from 225 to 585ms in Expt 1 and 225 to 608ms in Expt 1-GEN (on average across participants). In Expt 1, the 2-target trial movement time thresholds ranged from 225 to 655ms. We used this 70th percentile updating to individualize the movement time thresholds for different participants because we found that, in pilot data, individuals with a large fraction of "too-slow" feedback sometimes exhibited erratic, seemingly exploratory behavior on 2-target trials, and abandoned intermediate movements. Individuals with a smaller fraction of "too-slow" feedback, however, did not. Intermediate movements are of fundamental interest for the examination of motor planning during uncertainty as these movements reflect low-level implicit motor planning 16, but unfortunately, abandonment of intermediate movements has remained an issue in studies with standard implementations of go-before-you-know tasks, consequently leading to striking data exclusion criteria (e.g., removal of 7%-50% of participants in previous work 5,12-14,25,26 […]

[…] in counterclockwise (CCW) FFs. We balanced the directions of the applied FFs across participants so that half experienced the multi-FF environment with = +1 for left target or right target trials (e.g., see FF A in Fig 1c) and = −1 for center target trials (e.g., see FF B in Fig 1c), and the other half experienced the multi-FF environment with = −1 for left target or right target trials and = +1 for center target trials. Data associated with each target were then combined from each subgroup.
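The data-driven update rule described under Movement time thresholds (70th percentile of the last 18 movement times of the same trial type) can be sketched as a rolling window. The class name and the linear-interpolation percentile definition are assumptions; the text does not state which percentile convention was used.

```python
from collections import deque


class MovementTimeThreshold:
    """Rolling "too-slow" threshold for one trial type (hypothetical helper)."""

    def __init__(self, window=18, pct=70.0):
        # Keep only the most recent `window` movement times.
        self.times = deque(maxlen=window)
        self.pct = pct

    def add(self, movement_time_ms):
        self.times.append(movement_time_ms)

    def too_slow_threshold(self):
        """Percentile of the stored times, with linear interpolation
        between order statistics (an assumed convention)."""
        if not self.times:
            return None
        s = sorted(self.times)
        rank = (self.pct / 100.0) * (len(s) - 1)
        lo = int(rank)
        hi = min(lo + 1, len(s) - 1)
        return s[lo] + (rank - lo) * (s[hi] - s[lo])


# Separate trackers per trial type, as described in the text.
thresholds = {"1-target": MovementTimeThreshold(),
              "2-target": MovementTimeThreshold()}
```

Keeping one tracker per trial type mirrors the statement that separate thresholds were maintained for 1-target and 2-target trials.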

Error clamp and partial error clamp trials
Because actions made during reaching movements may result from either feedforward motor planning or online feedback corrections to movement errors, we used error clamp (EC) trials to restrict deviations from the straight-line path towards the target during 1-target trials. We implemented these EC trials as a highly stiff (6000 N/m), viscous (250 Ns/m) one-dimensional spring and damper system in the direction orthogonal to the straight-line path between the initial hand position and the cued target. In line with previous work, these EC trials effectively eliminated movement errors (average maximum absolute deviation, <1.9mm), and allowed for a high-accuracy measurement of feedforward participant-produced force patterns 37,39.

Unlike on 1-target trials, feedback corrections are to be expected during 2-target trials if task success is to be achieved, because participants must reflexively correct their movements following divulgence of the ultimate target. We thus devised a variant of the EC, which we term the partial error clamp (pEC), to measure the initial segment of force output during 2-target trials that reflects feedforward motor planning before these feedback corrections occur. As the initial motion on these 2-target trials was directed towards the center target, we aligned the pECs to the straight-line path to the center target, but we smoothly transitioned the movement from the highly stiff and viscous environment into a null environment (i.e., the robot motors were disabled) after 11cm. Note that the 11cm point was selected based on an analysis of pilot data to determine the onset of feedback corrections to the final target.
These pEC trials allowed us to measure feedforward force patterns early in the movement, while motor errors were minimized (average maximum absolute deviation, <2.3mm), but still permitted participants to carry out feedback corrections later in the movement for final target acquisition. Post-hoc surveys indicated that 5/16 participants noticed pECs, whereas 6/16 participants noticed the ECs.
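As a rough illustration of the error clamp, the one-dimensional spring and damper can be written as a restoring force on the lateral (path-orthogonal) component of hand position and velocity. This is a minimal sketch: the function names are hypothetical, and the smooth transition to the null environment used on pEC trials is not modeled because its form is not specified here.

```python
import math


def lateral_components(hand_xy, vel_xy, start_xy, target_xy):
    """Project hand position and velocity onto the axis orthogonal to the
    straight-line path from start to target. Units: m and m/s."""
    dx = target_xy[0] - start_xy[0]
    dy = target_xy[1] - start_xy[1]
    norm = math.hypot(dx, dy)
    nx, ny = -dy / norm, dx / norm  # unit vector orthogonal to the path
    pos = (hand_xy[0] - start_xy[0]) * nx + (hand_xy[1] - start_xy[1]) * ny
    vel = vel_xy[0] * nx + vel_xy[1] * ny
    return pos, vel


def error_clamp_force(lateral_pos, lateral_vel,
                      stiffness=6000.0, damping=250.0):
    """Spring-damper force (N) along the orthogonal axis.

    stiffness is in N/m and damping in Ns/m, as reported in the text.
    The force opposes lateral deviation and lateral velocity.
    """
    return -stiffness * lateral_pos - damping * lateral_vel
```

With these gains, a 1mm lateral deviation at zero lateral velocity produces a restoring force of about 6N, which is consistent with the sub-2mm deviations reported above.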

Training schedules (Expt 1 and Expt 1-GEN)
We divided the experiment into baseline, training, and test epochs with a total of 1305 outward reaching movements. The experiment began with the baseline epoch, which consisted of nine blocks. The first three blocks were comprised of 120 null 1-target trials that familiarized participants with the basic experimental setup and feedback structure described above. The next four blocks were comprised of 200 2-target trials, the first 50 of which were null trials, and the remaining 150 were 80% null trials and 20% pEC trials. The last three blocks of the baseline period reacquainted participants with 1-target trials before the training period started, and were comprised of 85 1-target trials, in which 80% were null trials and 20% were EC trials. The force patterns measured on EC and pEC trials throughout this epoch were used as a baseline for estimating learning-related changes in force patterns during subsequent blocks. Note that in this epoch, all blocks comprised of 1-target trials probed each target direction in equal amounts.

The baseline epoch was followed by the training epoch, which was separated into seven blocks and was comprised exclusively of 1-target trials. […] interspersed in a pattern that was random (frequency of 1 in 5 during baseline and training and 1 in 4 during test) but which avoided consecutive EC/pEC trials to prevent decay.

The training schedule of Expt 1-GEN was analogous to that of Expt 1, but 2-target trials were replaced with 1-target trials that were positioned at 1 of 9 different directions (from -30° to 30° every 7.5°).

Thus the training epoch of Expt 1-GEN was identical to that of Expt 1, but during the baseline epoch, participants reached towards each of the nine targets, presented in random order, for an equal number of trials. Baseline force patterns associated with each target were probed with ECs on 20% of trials after the third baseline block as in Expt 1. During the test epoch, the 2-target pEC trials from Expt 1 were replaced with 1-target EC trials that probed generalization to the targets that were positioned in-between the trained target directions (i.e., these trials probed the targets located at ±22.5°, ±15°, and ±7.5°).

Apparatus for obstacle avoidance experiments (Expt 2a and Expt 2b)
Participants were instructed to grasp a lightweight plastic handle that sheathed a digital stylus and make reaching movements with their right hands in the horizontal plane. We instructed participants to slide the handle across the surface of a tablet capable of recording hand position at 200Hz with a resolution of 0.01mm. All visual stimuli, including targets, obstacles, and a real-time cursor showing hand position, were displayed on a horizontally oriented LCD computer monitor (with a screen refresh rate of 120Hz and a motion display latency of ~25ms) that was mounted above the tablet at shoulder level and therefore obstructed view of the hand. Participants were positioned such that their midlines were aligned with the middle of the monitor and tablet.

Design of obstacles (Expt 2a and Expt 2b)
In Expts 2a and 2b, participants made reaching movements using the 1-target and 2-target trial configurations from Expt 1, but on some trials, we presented visual obstacles that we instructed participants to avoid. As illustrated in Fig 3c, the obstacle in Expt 2a was rectangular in shape (width 1cm and length 12cm) and was oriented so that its long axis was perpendicular to the vector between the start position and the target. Moreover, it was positioned midway between the start location and the target, and from this location, protruded 2cm towards the midline and 10cm away from it so that movements around the obstacle would be consistently deflected towards the midline. In Expt 2b, we modified the obstacle from Expt 2a by clipping off the 10cm away-from-midline protrusion, so that it still protruded 2cm towards the midline, but now 0cm away from it (see Fig. 3c […]

[…] thresholding procedure was identical to Expt 1, but with separate thresholds maintained for 2-target trials, obstacle-obstructed 1-target trials, and all remaining 1-target trials (i.e., all obstacle-free 1-target trials and all obstacle-present 1-target trials in which the obstacle did not directly block the target). We did not maintain separate thresholds for obstacle-free vs obstacle-present 2-target trials because pilot studies indicated that the differences in movement completion times were small. The 2-target trial movement time thresholds ranged from 225 to 645ms in Expt 2a and 225 to 598ms in Expt 2b (on average across participants). The obstacle-obstructed 1-target trial movement time threshold ranged from 225 to 450ms in Expt 2a and 225 to 360ms in Expt 2b. The movement time threshold for the remaining 1-target trials ranged from 225 to 380ms in Expt 2a and 225 to 382ms in Expt 2b.
After participants reached the final target, they were instructed to move the handle back to the starting position to begin the next trial.

Note that all participants in both experiments completed an equal number of trials in which the obstacle was positioned between the start target and the left target (left-side obstacle condition) and between the start target and right target (right-side obstacle condition). Experiments were, therefore, balanced within participants to cancel out any target-specific effects that might lead to biases in movement direction. Fig 3 displays […]

[…] (460 trials total), and was designed to familiarize participants with the basic task and feedback structure, before obstacle-present 2-target trials were presented in the test epoch. The first two blocks were comprised of 120 obstacle-free 1-target trials, and the next two blocks were comprised of 100 obstacle-present 1-target trials. Of the five blocks that followed, which included a total of 240 trials, three were comprised of obstacle-free 1- and 2-target trials (with 40% 1-target trials and 60% 2-target trials, 180 trials total) and the remaining two blocks were comprised solely of obstacle-present 1-target trials (60 trials total). Note that across all baseline epoch trials, 1-target trials probed each target direction in equal amounts.

The test epoch included six blocks (300 trials total), and was designed to probe motor planning for obstacle-present 2-target trials so that predictions for MA and PO could be compared. All blocks in this epoch were comprised solely of obstacle-present trials, with 60% obstacle-present 1-target trials and 40% obstacle-present 2-target trials.
Of the obstacle-present 1-target trials, we probed movements towards the obstacle-obstructed target more often (50% of trials) because our analyses were more sensitive to movements towards this target compared to movements towards the center or unobstructed targets. The remaining 50% of obstacle-present 1-target trials probed the center and unobstructed targets in equal amounts. Note that, in both the baseline and test epochs, left-side and right-side obstacle conditions were separated into different blocks, and the ordering of these blocks was balanced across participants.

Analysis

Outlier Analysis
No participants were excluded from any dataset. Individual movements that did not comply with the task requirements, outlined in Targets and feedback (Expt 1 and Expt 1-GEN), were not eligible for analysis (<3% of trials in Expt 1, <2% of trials in Expt 1-GEN, <1% of trials in Expt 2a, and <2% of trials in Expt 2b). In addition, we discarded a small fraction of highly atypical movements based on two key features. For all experiments, we required that the movement time was between 225ms and 2000ms (<2% of trials in Expt 1, <1% of trials in Expt 1-GEN, <1% of trials in Expt 2a, and <1% of trials in Expt 2b) and for Expt 1 and Expt 1-GEN, we also required that peak velocity was between 0.2m/s and 1m/s (<1% of trials in Expt 1 and <1% of trials in Expt 1-GEN). Note that only movements performed after the familiarization blocks in each experiment were used for analysis, and for those movements, these criteria collectively resulted in the omission of <3% of trials in Expt 1, <2% of trials in Expt 1-GEN, <1% of trials in Expt 2a, and <2% of trials in Expt 2b.
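The two atypical-movement criteria above can be expressed as a simple per-trial filter. In this sketch the function name is hypothetical and the bounds are treated as inclusive, which is an assumption since the text only says "between". The peak-velocity check is applied only when a value is supplied, reflecting its use in Expt 1 and Expt 1-GEN only.

```python
def is_valid_trial(movement_time_ms, peak_velocity_m_s=None):
    """Return True if a trial passes the atypical-movement criteria.

    movement_time_ms: movement time in ms, required in [225, 2000].
    peak_velocity_m_s: peak velocity in m/s, required in [0.2, 1.0];
        pass None to skip this check (e.g., for Expt 2a and Expt 2b).
    """
    if not (225 <= movement_time_ms <= 2000):
        return False
    if peak_velocity_m_s is not None and not (0.2 <= peak_velocity_m_s <= 1.0):
        return False
    return True
```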

Analysis of force patterns in Expt 1
We examined the lateral force profiles participants produced that were orthogonal to the cued target direction for 1-target EC trials and to the center target direction for 2-target pEC trials, corresponding to the axis of the imposed perturbations (see Multi-force-field environment). We aligned all force profiles to the onset of the target cue (T_ON, 40-50ms after movement onset), and used the population-averaged force profiles measured during the test period, after participants became acclimated to the multi-FF environment, to construct the MA and PO predictions. Specifically, we constructed the MA prediction by averaging the force profiles associated with the left and right targets, and we constructed the PO prediction by directly using the force profiles associated with the center target (Fig 2c). Since we sought to isolate the feedforward component of the data and predictions, before feedback responses to the target cue occurred, we analyzed all force profiles until the minimum time (across participants) that differences in force output on left and right cued 2-target pEC trials were significantly different from zero (T_RESP, ~150ms after T_ON). In addition, because the imposed FF environment was velocity-dependent, and adaptive responses to velocity-dependent dynamics are known to be scaled by movement velocity from one trial to the next 44, we normalized each force profile by the velocity-dependent level of ideal compensation.

For a simple determination of how participants compensated for the multi-FF environment we imposed, we characterized the adaptive response on 1-target trials with an adaptation coefficient (AC), calculated as the slope from a linear regression of the baseline-subtracted force profiles participants made during EC trials onto the ideal compensatory force 38,39,42.
For trials that were associated with the FF perturbations imposed during movements towards the center target, we defined the AC so that full FF compensation would yield an AC of +1. For trials that were associated with the FF perturbations imposed during movements towards the left or right target, we defined the AC so that full FF compensation would yield an AC of -1.
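The adaptation coefficient can be illustrated as an ordinary least-squares slope. This is a minimal sketch under stated assumptions: the function name is hypothetical, the regression includes an intercept (the text does not say whether one was fitted), and the target-dependent sign convention described above is left to the caller through the sign of the ideal force.

```python
def adaptation_coefficient(force, baseline_force, ideal_force):
    """Slope of the regression of baseline-subtracted force onto ideal force.

    All arguments are equal-length sequences sampled at the same timepoints.
    A slope of 1 means the measured force matches the ideal compensation.
    """
    y = [f - b for f, b in zip(force, baseline_force)]
    x = list(ideal_force)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx
```

For example, a force profile that is 80% of the ideal profile, offset by a constant baseline, gives a coefficient of approximately 0.8.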

To quantify the similarity between the 2-target trial force data and the predictions, as shown in Fig 2g, we devised a prediction index (PI) that results in a value of +1 if the 2-target trial data is perfectly similar to the PO prediction, -1 if it is perfectly similar to the MA prediction, and 0 if the data were halfway between both predictions,

$$PI = \frac{F_{2T} - F_{CM}}{F_{DM}}, \qquad F_{CM} = \frac{F_{PO} + F_{MA}}{2}, \qquad F_{DM} = \frac{F_{PO} - F_{MA}}{2},$$

where $F_{CM}$ and $F_{DM}$ correspond to the common and differential modes, respectively, of predicted mean force levels based on the PO ($F_{PO}$) and MA ($F_{MA}$) predictions, and $F_{2T}$ corresponds to the mean force level of the 2-target trial data. We calculated the PI over two intervals: one spanned movement onset until T_ON, and the other spanned movement onset until T_RESP.
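A prediction index with the stated properties (+1 at the PO prediction, -1 at the MA prediction, 0 halfway) can be sketched as follows. The sketch assumes the index is linear in the mean force level, with the common mode taken as the mean of the two predictions and the differential mode as half their difference; the function and variable names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def prediction_index(force_2target, force_left, force_right, force_center):
    """Prediction index over one analysis interval.

    MA prediction: pointwise average of the left- and right-target profiles.
    PO prediction: the center-target profile.
    All inputs are equal-length force profiles over the same interval.
    """
    ma_profile = [(l + r) / 2.0 for l, r in zip(force_left, force_right)]
    f_ma = mean(ma_profile)
    f_po = mean(force_center)
    f_2t = mean(force_2target)
    common = (f_po + f_ma) / 2.0        # common mode of the predictions
    differential = (f_po - f_ma) / 2.0  # differential mode of the predictions
    return (f_2t - common) / differential
```

The index is undefined when the two predictions have the same mean force level, since the differential mode is then zero; in the multi-FF environment the predictions are designed to differ.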

Refinement of predictions based on generalization of adaptive responses (Expt 1 and Expt 1-GEN)
Due to non-trivial variability in motor output, participants occasionally deviated from the intended target direction on 1-target trials and from the center target direction on 2-target trials. Directional deviations consequently bias both the MA and PO predictions since adjacent targets were associated with different FFs in the composite environment we designed. To account for this variability-induced effect and refine our predictions, we first measured how the multi-FF environment generalizes to nine different movement directions in Expt 1-GEN (see Training schedules (Expt 1 and Expt 1-GEN)). We then estimated the generalization of adaptation throughout our composite environment by fitting the population-averaged ACs (from the test period) associated with every probed target direction to a model that was based on the additive combination of Gaussians centered around the trained target directions (+30°/0°/-30°), […] There are four free parameters: is the width of each Gaussian, 1 and 2 are the heights of the