Representation of continuous hand and arm movements in macaque areas M1, F5, and AIP: a comparative decoding study

Objective. In the last decade, multiple brain areas have been investigated with respect to their capability for decoding continuous arm or hand movements. So far, these studies have mainly focused on motor or premotor areas like M1 and F5. However, there is accumulating evidence that the anterior intraparietal area (AIP) in the parietal cortex also contains information about continuous movement. Approach. In this study, we decoded 27 degrees of freedom representing complete hand and arm kinematics during a delayed grasping task from simultaneously recorded activity in areas M1, F5, and AIP of two macaque monkeys (Macaca mulatta). Main results. We found that all three areas provided decoding performance significantly above chance, with M1 yielding the highest decoding accuracy, followed by F5 and AIP. Furthermore, we provide support for the notion that AIP not only codes categorical visual features of objects to be grasped but also contains a substantial amount of temporal kinematic information. Significance. This could be utilized in future developments of neural interfaces for restoring hand and arm movements.


Introduction
The ability to perform dexterous hand manipulations is crucial for a person's well-being (Anderson 2004). In order to restore the skill of everyday hand movements in patients who have lost this ability due to spinal cord injuries, loss of limbs or other motor diseases, the development of neuroprostheses could be very beneficial. The goal of these devices is to translate neural activity from brain areas involved in movement generation into motor output signals to control an external actuator as a hand and arm replacement. Initial successful attempts have been realized in both humans (Hochberg et al 2012, Collinger et al 2013, Aflalo et al 2015) and monkeys (Wessberg et al 2000, Carmena et al 2003, Lebedev et al 2005, Velliste et al 2008), where neural activity from primary motor cortex (M1) and other cortical areas was used to control a robotic arm and hand, which enabled the subject to successfully perform several reaching and grasping tasks.
On the cortical level, the control of visually guided hand movements requires a visuomotor transformation to convert visual information of an object to be grasped or manipulated into a movement plan that can be executed. Several areas in the brain have been identified to be involved in this visuomotor transformation (Jeannerod et al 1995). The anterior intraparietal area (AIP) in the parietal lobe is specialized for the visual guidance of hand movements (Gallese et al 1994, Murata et al 2000, Baumann et al 2009). It receives visual information from areas like the caudal intraparietal sulcus (Sakata et al 1995, Sakata et al 1999) and extracts visual and spatial characteristics of graspable objects. The information is shared with the rostral part of the inferior premotor cortex (area F5) (Sakata et al 1995, Sakata et al 1997). In area F5, a movement plan is selected based on the information received from AIP (Murata et al 1997) and passed on to the primary motor cortex (M1) for execution. From there, cortico-spinal projections activate the spinal cord that excites the muscles (Bennett and Lemon 1996, Holdefer and Miller 2002, Morrow and Miller 2003, Rathelot and Strick 2006).
More precisely, F5 has been suggested to encode specific grip types for grasping an object (Rizzolatti et al 1987, Rizzolatti et al 1988, Murata et al 1997, Kakei et al 2001, Raos et al 2006, Stark et al 2007, Fluet et al 2010) rather than specific joint or muscle commands for executing a grip (Rizzolatti and Luppino 2001, Umilta et al 2007). Instead, joint angle and muscle representations were reported to be encoded in primary motor cortex (Thach 1978, Bennett and Lemon 1996, Holdefer and Miller 2002, Morrow and Miller 2003, Rathelot and Strick 2006, Umilta et al 2007) together with more abstract kinematic features like movement direction (Thach 1978, Georgopoulos et al 1986, Ashe and Georgopoulos 1994, Kakei et al 1999), position of the wrist (Thach 1978, Ashe and Georgopoulos 1994), and force (Cheney and Fetz 1980, Taira et al 1996, Ashe 1997). Therefore, many studies investigating the possibilities to decode hand and arm movements from neural activity have focused on utilizing neural activity from M1 (Ben Hamed et al 2007, Vargas-Irwin et al 2010, Orsborn et al 2014) and F5 for their predictions (Bansal et al 2012, Aggarwal et al 2013). Most of these studies recorded neural activity from rostral regions of M1. Rathelot and Strick (2009) discovered that M1 can be partitioned into two subdivisions with different influences on motor output: one part (termed 'old' M1 and located in the rostral region of M1) contains cells that contact spinal interneurons and that are therefore considered to have a mainly indirect influence on motor commands. In contrast, neurons making direct, monosynaptic connections with motoneurons in the spinal cord (CM cells) are almost exclusively located in the caudal region of M1 (termed 'new' M1), and these cells might be particularly relevant for dexterous finger movement control (Rathelot and Strick 2009).
Recordings from 'new M1' will likely include a larger fraction of neurons located in the neighborhood of CM cells that are strongly selective for finger movements and grip types, e.g. as observed in Schaffelhofer et al (2015a).
Although AIP is believed to carry information about visual properties of the object to be grasped (Jeannerod et al 1995, Sakata et al 1995, Sakata et al 1997, Sakata et al 1999, Murata et al 2000), there is accumulating evidence that units in AIP also carry information about grip types and hand movement (Baumann et al 2009, Aflalo et al 2015).
Inactivation of area AIP in monkeys results in deficits of hand preshaping and mismatch of hand orientation when grasping an object (Faugier-Grimaud et al 1978, Gallese et al 1994), impairments that are very similar to those reported after inactivation of F5 (Fogassi et al 2001), and AIP neurons have been reported to become active during movement execution (Gallese et al 1994, Jeannerod et al 1995, Sakata et al 1995, Sakata et al 1997, Gardner et al 1999). Cells in AIP were found to be tuned to grip type and target position in space (Jeannerod et al 1995, Murata et al 2000, Baumann et al 2009, Lehmann and Scherberger 2013). It was suggested that activity in AIP could already reflect a motor plan based on visual and somatosensory information (Murata et al 2000, Debowy et al 2001) instead of solely extracting visual cues relevant for grasping.
So far, only very few studies have investigated the suitability of AIP for decoding grasping features: Townsend et al (2011) as well as Lehmann and Scherberger (2013) were able to decode two different grip types with an accuracy of ∼70% and 75% (averaged), respectively. However, the result was considerably lower than for decoding from area F5 (performance: >90%). A similar result was obtained by Schaffelhofer et al (2015a), who predicted twenty different grip types with both F5 and AIP and obtained higher decoding accuracy with F5 than with AIP. In contrast, decoding of grip type together with object position in space or object orientation could be performed with higher accuracy from signals in AIP than F5 (Townsend et al 2011, Lehmann and Scherberger 2013). Furthermore, signals from posterior parietal cortex (PPC) such as the parietal reach region and area 5d have been used to predict reach kinematics in 3D space (Wessberg et al 2000, Hauschild et al 2012, Aflalo et al 2015). In this study, we recorded spiking activity simultaneously from the three cortical areas AIP, F5, and 'new' M1. Using a delayed grasping task with a large number of distinct objects, we predicted 27 degrees of freedom (DOF) of the hand and arm from these brain areas, which, to our knowledge, is the most complete decoding study of finger, wrist, and arm joints undertaken so far. We found that all 27 DOF could be predicted accurately from either of the three cortical areas, with M1 delivering the highest accuracy, followed by F5. AIP yielded the lowest decoding performance, but performed significantly above chance. To our knowledge, this is the first time that continuous decoding of grasp kinematics was achieved using spiking activity recorded simultaneously from areas AIP, F5, and M1.

Basic procedures
For decoding hand movements, two purpose-bred macaque monkeys (Macaca mulatta; animal Z: female, 7.0 kg; animal M: male, 10.5 kg) were trained to grasp a wide range of different objects (Schaffelhofer et al 2015a) while wearing an instrumented glove (Schaffelhofer and Scherberger 2012). After training was accomplished, both animals were implanted with head holders on the skull and subsequently with microelectrode arrays in cortical areas AIP, F5, and M1. In the following recording sessions, spiking activity was recorded from these electrodes together with the kinematics of the primate hand. Animal care and all experimental procedures were conducted in accordance with German and European law and were in agreement with the Guidelines for the Care and Use of Mammals in Neuroscience and Behavioral Research (National Research Council 2003).

Experimental setup
During training and experimental sessions, animals were sitting upright in a customized animal chair with their head fixed. In order to protect the instrumented glove, we constrained the passive (non-grasping) hand with a plastic tube that encompassed the forearm in a natural posture. A capacitive switch at the side of the active (performing) hand allowed detecting the animal's hand position at rest. All graspable objects (figure 1(b)) were placed at a distance of ∼25 cm in front of the animal at chest level. We used a PC-controlled turntable (figure 1(a)) to pseudo-randomly present the objects. Light barriers and a step motor ensured precise object positioning. Additional light barriers beneath the turntable detected the time when the monkey lifted the displayed object.
Figure 1. During the experiments, hand kinematics were monitored by a data glove as shown in (c). Implantation sites of six microelectrode arrays in areas AIP, F5, and M1 are illustrated in (d) (IPS, intraparietal sulcus; CS, central sulcus; AS, arcuate sulcus). First and second principal components of the recorded kinematics (recording M111913) are shown in (e). Each marker corresponds to the first two principal components of the 27 DOF averaged across the hold epoch while the monkey was holding a specific object. Symbols correspond to objects as depicted in (b); marker size reflects object size.
Each turntable contained objects of abstract forms (figure 1(b)); in total, 48 such objects were presented. All objects had a uniform weight of 120 g, independent of their size and shape.
In addition to these 48 objects, animals were also trained to grasp a handle object in two different ways, either with a precision grip (using thumb and index finger) or a power grip (enclosure of handle using all digits). These grip types were detected by sensors in cavities at the middle of the handle and by a light barrier inside the handle aperture, respectively.
Eye position was measured with an optical eye tracker (ISCAN, Woburn, MA, USA), and hand kinematics were recorded by a custom-built hand tracking device (Schaffelhofer and Scherberger 2012; see below). To prevent interference with the electromagnetic hand tracker, ferromagnetic materials had to be avoided in the experimental setup, including animal chair, table, and manipulanda (Raab et al 1979, Kirsch et al 2006). All task-relevant behavioral parameters (eye position, stimulus presentation, switch activation) were controlled by custom-written behavioral control software implemented in LabView (National Instruments).

Behavioral paradigm
Monkeys were trained to perform grasping actions in the dark in order to observe motor signals in the absence of visual information. To realize this approach, a delayed grasping paradigm (turntable task) was implemented (Schaffelhofer et al 2015a): monkeys initialized a trial by pressing a capacitive switch in front of them. This action turned on a red LED that the animals had to fixate. After fixating for 500-800 ms, a target object located next to the fixation LED was illuminated for 700 ms (cue epoch), followed by a waiting period in the dark (planning epoch, 500-1000 ms) in which the animals had to withhold movement execution but continue to fixate until the fixation LED blinked ('go' signal). Then, animals grasped the object and held it up for 500 ms (hold epoch).
Grasp movements performed on the handle (grasping box task, see figure 1(b) 'Handle') were executed with the same paradigm with the exception that one of two additional LEDs instructed the monkeys during the cue period to perform either a precision grip (yellow LED) or a power grip (green LED). Incorrect trials were aborted immediately; correct trials were rewarded with juice.

Surgical procedures and imaging
Prior to surgery, we performed a 3D anatomical MRI scan of the animal's skull and brain to locate anatomical landmarks (Townsend et al 2011). For this, the animal was sedated (e.g., 10 mg kg⁻¹ ketamine and 0.5 mg kg⁻¹ xylazine, i.m.), placed in the scanner (GE Signa HD or Siemens TrioTim; 1.5 Tesla) in a prone position, and T1-weighted images were acquired (iso-voxel size: 0.7 mm³).
In an initial procedure, a head post (titanium cylinder; diameter 18 mm) was implanted on top of the skull (approximate stereotaxic position: midline, 40 mm anterior, 20 deg forward tilted) and secured with bone cement (Refobacin Plus, BioMed, Berlin) and orthopedic bone screws (Synthes, Switzerland). After recovery from this procedure and subsequent training with head fixation, each animal was implanted in a second procedure with six floating microelectrode arrays (FMAs; MicroProbes for Life Science, Gaithersburg, MD, USA). Specifically, two FMAs were inserted in each area AIP, F5, and M1 (see figure 1(d)). FMAs consisted of 32 non-moveable monopolar platinum-iridium electrodes (impedance: 300-600 kΩ at 1 kHz) as well as two ground and two reference electrodes per array (impedance <10 kΩ). Electrode lengths ranged between 1.5 and 7.1 mm and were configured as in Townsend et al (2011).
FMA implantation locations are depicted in figure 1(d). In both animals the lateral array in AIP was located at the end of the intraparietal sulcus at the level of the parietal area PF, whereas the medial array was placed more posteriorly and medially at the level of the parietal area PFG (Borra et al 2008). In area F5, the lateral array was positioned approximately in area F5a.

All surgical procedures were performed under aseptic conditions and general anesthesia (e.g., induction with ketamine 10 mg kg⁻¹ i.m. and atropine 0.05 mg kg⁻¹ s.c., followed by intubation, isoflurane 1-2%, and analgesia with buprenorphine 0.01 mg kg⁻¹ s.c.). Heart and respiration rate, electrocardiogram, oxygen saturation, and body temperature were monitored continuously. Systemic antibiotics and analgesics were administered for several days after each surgery. To prevent brain swelling while the dura was open, the animal was mildly hyperventilated (end-tidal CO₂ <30 mmHg) and mannitol was kept at hand. Animals were allowed to recover fully (∼2 weeks) before behavioral training or recording experiments recommenced.

Hand kinematics
Previously, we developed an instrumented glove for small primates that allowed recording the animal's finger, hand, and arm movements in 27 DOF (Schaffelhofer and Scherberger 2012). The glove was equipped with 7 electromagnetic sensor coils that enabled hand and arm movement tracking at a temporal resolution of 100 Hz without depending on line of sight to a camera (figure 1(c)). This way, finger movements could be tracked continuously even when sensors were located behind or below objects. The method is described in detail in Schaffelhofer and Scherberger (2012); in short, it exploits the anatomical constraints of the primate hand and combines them with the 3D position and orientation information of the sensors located at the fingertips, the hand's dorsum, and the lower forearm to obtain a full kinematic description of the animal's hand and arm.
Recorded 3D positions of the finger, hand, and arm joints were compared with the animal's anatomy. Samples with rarely occurring errors (e.g., due to brief freezing of the kinematic tracking) were set to NaN. Then, each joint angle was calculated for every recorded time step and afterwards linearly resampled to exactly 100 Hz. This way, missing values were linearly interpolated.
Specifically, the following 27 joint angles were extracted: finger flexion/extension in the carpometacarpal (CMC), metacarpophalangeal (MCP), and interphalangeal joint of the thumb and the MCP, proximal interphalangeal, and distal interphalangeal (DIP) joint of the index, middle, ring, and little finger (fingers 2-5); finger abduction/adduction in the thumb CMC and the MCP of fingers 2-5; radial/ulnar deviation (yaw), flexion/extension (pitch), and pronation/ supination (roll) of the wrist; flexion/extension of the elbow; and shoulder adduction/abduction, flexion/extension, and internal/external rotation.
Furthermore, the velocity of every DOF at each time sample was calculated and added to the kinematics. This resulted in a 54×T matrix, where T is the total number of samples. Note that only data recorded while the monkey was engaged in the handle or turntable task was used and concatenated into the matrix, including correctly and incorrectly performed trials, reward epochs, and the time epochs between trials (inter-trial intervals). Data recorded while turntables or the grasping box were exchanged by the experimenter were not included.
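As an illustration, the preprocessing steps above (glitch removal, linear resampling to 100 Hz, and stacking joint angles with their velocities into a 54×T matrix) can be sketched as follows. This is a minimal Python sketch with assumed array layouts and function names, not the original implementation:

```python
import numpy as np

def build_kinematic_matrix(t_raw, angles_raw, fs=100.0):
    """Resample 27 joint-angle traces to a uniform grid and append velocities.

    t_raw      : (T_raw,) sample times in seconds (possibly irregular)
    angles_raw : (T_raw, 27) joint angles in degrees; tracking glitches are NaN
    Returns a (54, T) matrix: the 27 angles stacked on their 27 velocities.
    """
    t_uniform = np.arange(t_raw[0], t_raw[-1], 1.0 / fs)
    angles = np.empty((len(t_uniform), angles_raw.shape[1]))
    for j in range(angles_raw.shape[1]):
        col = angles_raw[:, j]
        valid = ~np.isnan(col)                 # drop glitch samples ...
        angles[:, j] = np.interp(t_uniform, t_raw[valid], col[valid])
        # ... so the resampling also linearly interpolates the missing values
    velocities = np.gradient(angles, 1.0 / fs, axis=0)   # deg/s per sample
    return np.vstack([angles.T, velocities.T])           # (54, T)
```

Dropping NaN samples before `np.interp` makes the interpolation step fill tracking gaps as a side effect, mirroring the description above.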
For illustration purposes, the 27 DOF were averaged during the hold epoch across all successful trials of one recording session (M111913), and a principal component analysis was performed. The first two principal components of the kinematics of each trial are illustrated in figure 1(e), demonstrating that we were able to cover a wide range of hand configurations together with sampling very small differences in hand kinematics.

Neural recordings
Extracellular signals were recorded simultaneously from six FMAs (6×32 channels) that were permanently implanted into areas AIP, F5, and M1 (two FMAs in each area). All arrays provided sufficiently large numbers of recording channels within the specified electrode impedance (∼500 kΩ) across all electrode rows. Raw signals were sampled at a rate of 24 kHz with a resolution of 16 bit and stored to disk together with the behavioral and kinematic data using a RZ2 Biosignal Processor (Tucker-Davis Technologies, FL, USA). Offline, the raw data was bandpass filtered (0.3-7 kHz) and spikes were detected (threshold: 3.5 standard deviations). Spike sorting was performed first with WaveClus (Quiroga et al 2004) for automatic sorting, followed by manual inspection and revision using Offline Sorter (Plexon, TX, USA). Spikes of single and multiunits were further analyzed.
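The threshold-crossing step can be sketched as below, assuming the trace has already been bandpass filtered. The negative-going sign convention and the refractory period are our assumptions; the actual pipeline used WaveClus and Offline Sorter:

```python
import numpy as np

def detect_spikes(filtered, fs=24000.0, k=3.5, refractory_s=0.001):
    """Threshold crossings on a bandpass-filtered extracellular trace.

    Returns spike times (s) at which the signal drops below -k standard
    deviations, enforcing a simple refractory period so each threshold
    excursion is counted once.
    """
    thresh = -k * np.std(filtered)
    below = filtered < thresh
    # first sample of each sub-threshold excursion
    onsets = np.flatnonzero(below & ~np.roll(below, 1))
    spikes, last = [], -np.inf
    for i in onsets:
        t = i / fs
        if t - last >= refractory_s:
            spikes.append(t)
            last = t
    return np.array(spikes)
```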
2.7. Decoding

2.7.1. Decoding algorithm. For offline prediction of the hand and arm kinematics, we employed a Kalman filter (Kalman 1960) as described in Wu et al (2004, 2006). A Kalman filter assumes a linear relationship between the kinematics at time instant k and the following time instant k+1 (step size: 10 ms), as well as between the kinematics and the neural data at time k:

x_{k+1} = A x_k + w_k,
z_k = H x_k + q_k.

Here, x_{k+1} is the 54×1 vector of the 27 joint angles and their velocities at time k+1, A a 54×54 matrix relating the kinematics of one time step to the following one, z_k a vector of length N containing the neural data at time k, where N denotes the total number of recorded units, H an N×54 matrix relating the kinematics at time k to the corresponding neural information, and w_k and q_k noise terms that are assumed to be normally distributed with zero mean, i.e., w_k ∼ N(0, W) and q_k ∼ N(0, Q), with covariance matrices W and Q.

2.7.2. Decoding procedure. For decoding a complete recording session, we used 7-fold cross-validation. For this, the recording session was divided into seven data sets of equal length and composition: each task type (i.e., grasping box task and data from each turntable) was divided into seven parts of equal length, and these parts were randomly attributed to the seven sets, so that eventually each set contained data of each turntable and the grasping box task.
To decode the kinematics of the ith of the seven sets, the parameters of the Kalman filter, namely A, H, W, and Q, were calculated from the neural and kinematic data of the remaining six sets (decoder training). Then, the kinematic data of the ith set (27 DOF and their respective velocities) were iteratively predicted, with the initial value of each DOF (and velocity) at the first time step set randomly within the range observed for the respective DOF/velocity during the recording (interval between the smallest and largest value). For each subsequent time step, the Kalman filter first calculates an a priori estimate of the kinematics based on the kinematic state of the previous time step. This prediction is then updated based on the corresponding neural activity (a posteriori estimate) (Welch and Bishop 2006). This procedure was carried out for each of the seven data sets, hence leading to a complete decoding of the entire recording session. The decoding procedure was implemented in Matlab (MathWorks, Natick, MA, USA).
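The training and prediction steps can be sketched as follows: least-squares fits of the model parameters as in Wu et al (2004, 2006), followed by the predict/update recursion. This is an illustrative Python sketch (the original implementation was in Matlab); the identity initialization of the error covariance P is our simplification:

```python
import numpy as np

def train_kalman(X, Z):
    """Least-squares fit of the Kalman model parameters.
    X : (d, M) kinematic states (here d = 54), Z : (n, M) spike counts."""
    X1, X2 = X[:, :-1], X[:, 1:]
    A = X2 @ X1.T @ np.linalg.pinv(X1 @ X1.T)      # state transition
    W = (X2 - A @ X1) @ (X2 - A @ X1).T / (X.shape[1] - 1)
    H = Z @ X.T @ np.linalg.pinv(X @ X.T)          # observation model
    Q = (Z - H @ X) @ (Z - H @ X).T / X.shape[1]
    return A, W, H, Q

def decode_kalman(A, W, H, Q, Z, x0):
    """Iteratively predict kinematic states from neural observations Z (n, M)."""
    d = A.shape[0]
    x, P = x0.copy(), np.eye(d)                    # P init is an assumption
    out = np.empty((d, Z.shape[1]))
    for k in range(Z.shape[1]):
        x, P = A @ x, A @ P @ A.T + W              # a priori prediction
        K = P @ H.T @ np.linalg.pinv(H @ P @ H.T + Q)
        x = x + K @ (Z[:, k] - H @ x)              # a posteriori update
        P = (np.eye(d) - K @ H) @ P
        out[:, k] = x
    return out
```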
For illustration purposes, spatial trajectories of the recorded and decoded joints were used to drive a 3D skeletal model that was scaled to match the anatomy of each primate (Schaffelhofer et al 2015b). The model transformed the joint angles into the 3D positions of a skeleton and thus provided a 3D visualization of the complete hand and arm kinematics (see supplemental videos).

2.7.3. Variation of decoding parameters. The neural data used for decoding was processed in the following ways in order to systematically investigate their impact on decoding performance: (1) For each unit, spike times were binned in specific time intervals, i.e., the number of spikes occurring in a specific time bin was counted. Since the kinematic data was predicted by the Kalman filter at a frequency of 100 Hz, the bins for counting spikes were shifted in time steps of 10 ms. This resulted in overlapping bins whenever the window length was larger than 10 ms. To test the impact of bin size on decoding performance, bin length was systematically varied.
(2) In addition to bin length, we also tested whether introducing a time lag between the time series of the kinematics and the series of spike counts would change decoding performance. We therefore systematically varied the lag, using both positive and negative values: a lag l<0 and a bin length b>0 translated into predicting the kinematics at time point t_k with the spike count in the time window [t_k+l−b, t_k+l] in the a posteriori step of the Kalman filter, meaning that neural activity preceded the kinematics. Positive lags, for which the neural data used for decoding followed the kinematic prediction in time, were tested as well: a positive lag l>0 and bin length b>0 selected the spikes in the time window [t_k+l, t_k+l+b]. For clarity, we use the notation l=−0, b>0 to refer to the time window [t_k−b, t_k] and l=(+)0, b>0 for the time window [t_k, t_k+b] of spike counting.
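The overlapping binning with signed lags can be sketched as follows. The `lag_sign` flag, which disambiguates the −0/+0 convention, and the half-open counting windows are our own devices:

```python
import numpy as np

def binned_counts(spike_times, t_eval, bin_s, lag_s, lag_sign=-1):
    """Overlapping spike counts aligned to kinematic sample times t_eval.

    For a non-positive lag the window is [t + lag - bin, t + lag) (neural
    activity precedes the kinematics); for a positive lag it is
    [t + lag, t + lag + bin).  lag_sign disambiguates lag_s == 0.
    """
    spike_times = np.sort(np.asarray(spike_times))
    if lag_s > 0 or (lag_s == 0 and lag_sign > 0):
        lo, hi = t_eval + lag_s, t_eval + lag_s + bin_s
    else:
        lo, hi = t_eval + lag_s - bin_s, t_eval + lag_s
    # count spikes in [lo, hi) for every evaluation time at once
    return np.searchsorted(spike_times, hi) - np.searchsorted(spike_times, lo)
```

With `t_eval` spaced at 10 ms and `bin_s` larger than 10 ms, consecutive windows overlap exactly as described above.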
For each recording and cortical area or combination of areas, we tested the decoding performance for different combinations of bin and lag lengths, ranging from 0 to ±180 ms for lags and 10-250 ms for bin lengths.
2.7.4. Evaluation of decoding performance. To assess decoding performance, we calculated both Pearson's correlation coefficient (CC) and the relative root mean squared error (rRMSE) for each DOF across the entire decoded recording session, comparing the measured trajectory of a joint angle with the trajectory predicted by the Kalman filter. CC was defined as

CC = Σ_{k=1}^{N} (x_k − x̄)(x̂_k − x̄̂) / √( Σ_{k=1}^{N} (x_k − x̄)² · Σ_{k=1}^{N} (x̂_k − x̄̂)² ),

where N is the total number of samples, x_k and x̂_k denote the true and decoded DOF at time k, respectively, and x̄ and x̄̂ are the means across all N samples of the true and decoded DOF, respectively. rRMSE was determined by calculating the root mean squared error (RMSE) for each DOF and then normalizing it by the range of the respective DOF:

rRMSE = √( (1/N) Σ_{k=1}^{N} (x_k − x̂_k)² ) / (d_95 − d_5),

where d_5 and d_95 are the 5th and 95th percentiles of all recorded data samples of the respective DOF. This normalization allowed comparing prediction errors of joint angles relative to their operational range.
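Both measures reduce to a few lines; a sketch:

```python
import numpy as np

def cc(x_true, x_pred):
    """Pearson correlation between a recorded and a decoded joint-angle trace."""
    xt, xp = x_true - x_true.mean(), x_pred - x_pred.mean()
    return (xt @ xp) / np.sqrt((xt @ xt) * (xp @ xp))

def rrmse(x_true, x_pred):
    """RMSE normalized by the 5th-95th percentile range of the recorded DOF."""
    rmse = np.sqrt(np.mean((x_true - x_pred) ** 2))
    d5, d95 = np.percentile(x_true, [5, 95])
    return rmse / (d95 - d5)
```

Note that CC is invariant to offset and scale, whereas rRMSE penalizes both, which is why the two measures can disagree about the best parameter combination.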
For each neuronal dataset we determined the optimal lag and bin length combination that maximized CC or minimized rRMSE. Most of the time, the optimal parameter combinations for CC and rRMSE matched. Both are important performance measures; however, a trajectory prediction that captures the movement shape but has a constant offset can have a high CC and at the same time a high rRMSE. In cases where the optimal parameter combinations for CC and rRMSE did not match, we therefore favoured the combination yielding the better rRMSE, which generally also had a high CC. This best parameter combination was then applied to predict the movement kinematics of each dataset.
2.7.5. Chance decoding performance. Kalman filter decoding uses the previous kinematic state as the first part of its prediction (a-priori estimation). In case of a very regular movement, the filter might achieve a high performance by simply utilizing the regularity of the oscillatory kinematics, i.e., without relying on neural information at all. In such a case, a high decoding performance might be falsely attributed to the neural data.
To make such effects transparent and to avoid them as much as possible, we included both correct and incorrect trials (where the kinematics often did not follow the typical temporal structure of the correct trials) in the decoding. Furthermore, we calculated a chance level performance to illustrate the prediction accuracy of a decoder relying solely on inherent kinematic information. Comparing the resulting decoding performance to the accuracy obtained with standard decoding provided an estimate of how much the filter actually utilized neural data for prediction. To determine the chance performance level, we removed any potential task-relevant information contained in the spike train population by applying the following random shift method: spike sequences of individual units were shifted in time by a random shift length of at least 20 s in either positive or negative time direction. Neural data that was shifted beyond the end or the beginning of the recording interval was circularly reinserted at the beginning or end, respectively. This procedure destroyed any potential relationship between the kinematics of the limb and the spike sequence while keeping the inherent temporal structure of the spike sequence intact. Importantly, the random time shifts were different for each unit, which also destroyed any potential correlations within the neuronal population. Using this temporally reshuffled neural data, we carried out decoding runs and calculated the performance measures CC and rRMSE. By repeating this surrogate method ten times, we established the probability distribution of CC and rRMSE under chance performance for each area and recording session. To test whether the CC or rRMSE obtained from the respective standard decoding differed significantly from chance, we compared the respective CC or rRMSE value to the median of the chance probability distribution (two-sided sign test).
If p<0.05, we considered the performance of the standard decoding to be significantly different from chance.
When comparing the mean performance across all ten recording sessions to chance (as in the analysis of decoding performance for proximal and distal groups of joints, figure 6), a chance distribution across recording sessions was composed of the means of the chance distributions of each recording. Then, a two-sided signed-rank test was performed to determine whether the median of that chance distribution differed significantly from the mean performance of the standard decoding across all ten recording sessions.
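The per-session comparison to chance amounts to an exact binomial test on the signs of the differences between the observed value and the ten chance values. The sketch below is a stand-in for the test used here (in practice a statistics package would be used), with ties dropped as is conventional:

```python
import numpy as np
from math import comb

def sign_test_p(observed, chance_samples):
    """Two-sided exact sign test: does `observed` differ from the median of
    the chance distribution?  Ties (zero differences) are dropped."""
    diffs = np.asarray(chance_samples) - observed
    diffs = diffs[diffs != 0]
    n, k = len(diffs), int(np.sum(diffs > 0))
    # exact binomial tail probability for the smaller sign count
    tail = sum(comb(n, i) for i in range(min(k, n - k) + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

With ten surrogate repetitions, an observed CC above all ten chance values gives p = 2/1024 ≈ 0.002, i.e., significant at the 0.05 level.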

Results
We recorded 27 DOF of hand, wrist, and arm kinematics simultaneously with single and multiunit activity in the hand areas of primate motor (M1), premotor (F5), and parietal cortex (AIP) while animals performed the delayed grasping task. In total, ten recording sessions were analyzed for this study (monkey Z: six sessions, monkey M: four sessions).

Kinematic and neuronal data
Representative example trajectories of the 27 DOF from recording Z032012 are shown in figure 2, recorded while the monkey grasped and lifted a horizontal bar, a cube, and a small ball. As can be clearly observed, the objects were grasped with different hand and arm configurations, which was an important goal of the task design. In total, about 50 distinct objects of various sizes and shapes were tested (see experimental setup in materials and methods), covering a wide range of hand configurations. This allowed us to detect very small differences in hand kinematics (see also figure 1(e) in materials and methods).
Neural activity was recorded from six 32-channel FMAs that were implanted in M1, F5, and AIP (two each; see materials and methods). Table 1 provides an overview of the number of single and multiunits recorded in each cortical area and each session. An example population is illustrated in figure 2 (bottom) as a raster plot of neural activity that was recorded together with hand and arm kinematics. As expected, units in M1 displayed a clear modulation of firing rate during movement execution, whereas their activity was generally low during periods with no movement. During movement periods, an increase of spiking activity was also observed in F5 and AIP units; the extent, however, was smaller than in M1 and smallest in AIP. In contrast, AIP showed a higher firing rate during periods in which the object was presented to the monkey and when the animal was planning the movement. A similar behavior could be observed for F5, even though planning activity was decreased in comparison to AIP. Such activity patterns have been reported previously for single and multiunits in M1 (Poliakov and Schieber 1999, Umilta et al 2007).

Decoding of 27 DOF
Since a clear relationship between neural activity and kinematics could be observed, we decoded 27 DOF of the hand, wrist, and arm continuously over time using single and multiunit activity from M1, F5, and AIP. Figure 3 shows example trajectories (both true and predicted) from recording Z032012 of hand, wrist, and shoulder joints over a time course of 80 s while the monkey was lifting objects of the mixed turntable. Decodings were performed with units from either area M1 (figure 3(a)), F5 (figure 3(b)), or AIP (figure 3(c)). For all three areas, the estimated trajectories followed the real ones accurately: the timing of both movement onset and termination was captured very precisely. The movement amplitude was reproduced best when decoding with data from area M1 and was slightly diminished when predicting from areas F5 or AIP. Furthermore, in time periods without movement (inter-trial intervals) the predicted trajectory contained more jitter when decoding from AIP than from F5. t_div = 1761.5 s (marked with a pink triangle on top) is one of the time instants at which the data was divided in order to carry out the 7-fold cross-validation: here the Kalman filter was re-trained and decoding was restarted with new initial values within the range of the respective DOF (see decoding procedure in materials and methods). This led to brief discontinuities in most curves, since all decoded trajectories were randomly reset at this time instant. However, the Kalman filter managed to bring the trajectory back close to the true amplitude within 300-400 ms, illustrating the power of the algorithm.
For illustration purposes, the trajectories of both the recorded and the decoded joints were used to drive a 3D skeletal model (Schaffelhofer et al 2015b) (see materials and methods). Supplemental video 1 shows the movements of the real arm and hand together with the decoded movements using activity from all three areas while the monkey was lifting three different objects from the mixed turntable. In addition, supplemental video 2 compares the predicted movements when decoding was done from either of the three areas or from all areas combined.

Decoding performance of different areas
To quantify the decoding performance of the different areas, we calculated the CC and rRMSE for each DOF over the entire recording session (Z032012) and averaged them across all 27 DOF (figures 3(d) and (e)). As expected, decoding with neural signals from area M1 yielded the best performance in comparison to decoding from F5 or AIP. However, there was no significant difference in the mean between M1 and F5, neither for CC nor for rRMSE (one-way ANOVA and Tukey-Kramer multicomparison test, p>0.05). The similar CC demonstrated that the actual movement could be captured comparably precisely when decoding from M1 or F5. However, predictions made with F5 did not capture the amplitude of the movement as well as those made with M1, and there was more noise during resting phases (see also figures 3(a) and (b)), which was reflected in a higher rRMSE for F5. AIP also yielded good performance values; they did not differ significantly from those obtained from area F5. When comparing AIP and M1, however, a significant difference was found in the mean correlation coefficients (p<0.05).
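The two performance measures can be computed per DOF and then averaged, as sketched below. The exact normalization of the rRMSE is defined in the paper's materials and methods; here we assume the common convention of dividing the RMSE by the range of the true trajectory.

```python
import numpy as np

def cc(true, pred):
    """Pearson correlation coefficient between true and decoded trajectory."""
    return np.corrcoef(true, pred)[0, 1]

def rrmse(true, pred):
    """RMSE normalized by the range of the true trajectory (assumed definition)."""
    rmse = np.sqrt(np.mean((np.asarray(true) - np.asarray(pred)) ** 2))
    return rmse / (np.max(true) - np.min(true))

def mean_over_dof(true_mat, pred_mat, metric):
    """Average a per-DOF metric across all columns (e.g., 27 DOF)."""
    return np.mean([metric(true_mat[:, j], pred_mat[:, j])
                    for j in range(true_mat.shape[1])])
```

Note that CC is amplitude-invariant while rRMSE is not, which is why a decoder can show a similar CC but a higher rRMSE, as reported for F5 versus M1 above.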
In addition to predicting from one brain area alone, we also decoded from pairs of areas (AIP&F5 and M1&F5) and with all three areas combined. This allowed us to investigate whether decoding performance could be improved by combining information from different brain areas. Decoding from M1 and F5 in combination increased the mean CC and decreased the mean rRMSE slightly, but not significantly, in comparison to decoding from M1 alone. Similarly, there was no significant improvement when decoding from F5 and AIP in combination, as compared to decoding from F5 alone. As expected, decoding performance was best when all three areas were combined. However, there was no significant improvement in comparison to using only data from M1.
Next, we evaluated to what extent the decoding performance could be attributed to neural information, as opposed to movement information inherent in the kinematics, e.g., as present in rhythmic or oscillatory movements. In other words, we wanted to determine the decoding performance in the absence of any meaningful neuronal information with respect to movement. We therefore performed simulated decodings with randomly shifted spiking activity, in which the neural information was temporally uncoupled from the movement kinematics, to determine a chance-level performance (random shift method; see materials and methods for further explanation). This procedure was repeated ten times for each neuronal dataset (area or combination of areas), and CC and rRMSE were calculated for each DOF and averaged across all 27 DOF. Mean and standard deviation (across the ten repetitions) are shown in figures 3(d) and (e) (orange lines). Clearly, decoding with the original neural data significantly outperformed chance performance, both in terms of CC and rRMSE (two-sided sign test, see chance decoding performance in materials and methods, p<0.002). This demonstrates that significant movement information was present in these brain areas, which the decoder was able to translate into accurate movement predictions. Figure 4 illustrates the combined results across all recording sessions (six sessions for monkey Z, four sessions for monkey M; see also table 1). Overall, these results confirm the findings from the example recording in figures 3(d) and (e). When decoding with neural activity from a single area, M1 yielded the best results, followed by F5 and AIP; all accuracies differed significantly from each other (one-way ANOVA and Tukey-Kramer multicomparison analysis, p<0.05). Combining two or three areas for decoding did not improve the decoding performance significantly, as compared to using only the better performing single area.
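The random shift idea, temporally uncoupling spikes from kinematics to estimate chance performance, can be sketched as below. The use of a circular shift and the `min_shift` guard are our assumptions; `decode_fn` stands in for the full decode-and-score pipeline, and the paper's exact procedure is in its materials and methods.

```python
import numpy as np

def random_shift_chance(Z, decode_fn, n_rep=10, min_shift=100, rng=None):
    """Chance-level performance: circularly shift the spike-count matrix Z in
    time so that neural data is temporally uncoupled from the kinematics,
    then decode and score with decode_fn. Returns mean and s.d. over repeats."""
    rng = rng or np.random.default_rng()
    scores = []
    for _ in range(n_rep):
        shift = int(rng.integers(min_shift, Z.shape[0] - min_shift))
        scores.append(decode_fn(np.roll(Z, shift, axis=0)))
    return float(np.mean(scores)), float(np.std(scores))
```

Because the shift preserves each unit's firing statistics (rates, autocorrelation) and only destroys the temporal alignment with the kinematics, any residual decoding performance reflects structure in the kinematics themselves rather than neural information.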
Finally, we compared the performances between monkeys for each area. There were no differences in means or variances for M1 and F5 (two-sample t-test, p>0.10, and Bartlett's test, p>0.05, respectively). However, for area AIP, the variances differed significantly between animals (Bartlett's test, p<0.04). Together, accurate movement prediction was possible with each of the three cortical areas M1, F5, and AIP, combining these areas did not significantly increase the decoding accuracy, and decoding results were highly comparable between monkeys.

Influence of number of units on decoding performance
Because the number of units was lowest in area AIP (see table 1), an important control was to rule out that AIP's lower decoding performance was simply due to a smaller number of recorded units. To address this question, we randomly selected a fixed number of units from each of the areas M1, F5, and AIP and decoded with this data set. This was repeated ten times for each area. The resulting performances were averaged across all 27 DOF and all repetitions. Figure 5 shows the mean decoder performance across all ten recording sessions when randomly selecting 5, 15, 25, and 35 units. In addition, the rightmost bars ('max') show the decoding performance when, for each recording session, the maximum number of units available in all three areas, i.e., the smallest of the three unit counts in M1, F5, and AIP, was randomly selected (i.e., 45 units for session Z120511, 61 units for session Z011312, etc; see table 1). Regardless of the actual number of units, M1 yielded the best decoding performance for both CC and rRMSE, followed by F5 and AIP. Decoding performance (both CC and rRMSE) differed significantly between M1, F5, and AIP, regardless of the number of units chosen for decoding (one-way ANOVA and Tukey-Kramer multicomparison analysis, p<0.05). The ranking of the areas in terms of decoding performance therefore did not depend on the number of available units. Furthermore, as expected, decoding performance increased with the number of units included for decoding, and in individual recording sessions (data not shown) the standard deviations of CC and rRMSE decreased with a growing number of units, demonstrating that results became more consistent.
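The unit-subsampling control can be sketched as follows; `decode_fn` again stands in for the full Kalman decoding and scoring pipeline, and all names are illustrative rather than the authors' code.

```python
import numpy as np

def subsample_performance(Z_by_area, n_units, decode_fn, n_rep=10, rng=None):
    """Decode n_rep times with n_units drawn at random (without replacement)
    from each area's spike-count matrix (T x units); return mean/s.d. scores."""
    rng = rng or np.random.default_rng()
    results = {}
    for area, Z in Z_by_area.items():
        scores = [decode_fn(Z[:, rng.choice(Z.shape[1], n_units, replace=False)])
                  for _ in range(n_rep)]
        results[area] = (float(np.mean(scores)), float(np.std(scores)))
    return results
```

Drawing the same number of units from every area equalizes population size, so any remaining performance difference between areas reflects information content rather than unit count.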
Remarkably, performance was above chance even when using only a small number of units (e.g., five). For each recording, we compared the distribution of performance levels obtained from ten repetitions of randomly selecting five units with the distribution of chance levels obtained from ten repetitions of the random shift method using all available units in an area (Mann-Whitney U-test): when decoding from M1, the performances (both CC and rRMSE) differed significantly from chance in all ten recording sessions (p<0.001). For F5, there was a significant performance difference from chance in 9 and 10 of 10 recordings for CC and rRMSE, respectively (p<0.05). For area AIP, decoding with five units yielded a significantly better performance than chance in eight out of ten recordings for CC (p<0.05) and in all recordings for rRMSE (p<0.05).
We therefore conclude that all three areas carry a substantial amount of information about hand and arm kinematics, enabling kinematic predictions with better-than-chance accuracy even with a small number of units. Decoding accuracy improved monotonically with an increasing number of units, while the ranking of the areas in decoding performance was independent of the number of units used for decoding.

Decoding performance of proximal and distal movements
Decoding performance for different groups of joints is compared in figure 6. Although the electrode arrays were carefully placed in the hand areas of motor, premotor, and parietal cortex, the decoding of elbow and shoulder joints ('arm joints' in figure 6) generally performed best for all areas (significantly different from all other groups of DOF, except for the wrist angles when decoding from AIP and judging by rRMSE; one-way ANOVA and Tukey-Kramer multicomparison analysis, p<0.05). This is likely due to the task design, which concentrated on eliciting high variability in finger kinematics but placed more stereotypical demands on elbow and shoulder movements. These trajectories were therefore easier to predict. We also observed less noise in the recording of these angles, which might have affected the decoding performance. However, since this group comprised only 4 of the 27 DOF, its bias towards higher prediction accuracy was only marginal in the overall result presented in figure 4.
Beyond the difference between proximal and distal joint angles, there was little variation in decoding performance between hand and wrist joints (detailed results of significant differences are shown in figure 6). Thumb and little finger joints tended to be predicted with slightly less accuracy than the other DOF of the wrist and hand. The thumb's performance could be attributed to the fact that it was the finger with the most movement variation and was therefore also most prone to (kinematic) recording noise. Compared to the middle and ring fingers, which tended to move in a correlated fashion (sometimes together with the index finger), both thumb and little finger movements were more independent. In some grasping conditions, the thumb and little finger were not actively involved in the grasp (i.e., they did not touch the object) but moved alongside the other fingers in a more passive way (e.g., when grasping an object that did not require contact with the thumb or little finger). These fingers could then exhibit movement patterns with higher variation across the tested grip types.
In addition, figure 6 shows the respective chance performance levels. As is clearly visible, the decoding performance of all joint groups was significantly better than the chance level obtained by the random shift method (two-sided sign test, see chance decoding performance in materials and methods, p<0.002).

Optimal decoding parameters
As mentioned above, different parameter combinations (varying lag and bin lengths) were tested for decoding and evaluated in terms of CC and rRMSE. Bin lengths of 10-250 ms were systematically combined with lags ranging from 0 to ±1.8 s, and the best combination with respect to CC and rRMSE was determined. This procedure was repeated for all recording sessions and areas. In 95% of these optimal combinations, a bin length of 10 ms was the best duration for counting spikes. Optimal lags, however, varied more strongly between areas and recordings. Figures 7(a) and (b) illustrate the decoding performance of M1, F5, and AIP as a function of lag length while the bin length was kept constant at 10 ms. Furthermore, the cumulative distribution of optimal lag lengths (across recording sessions) for each area is provided in figures 7(c) and (d). As described in section 2.7.3, materials and methods, negative lag lengths correspond to neural data that preceded the kinematics to be decoded, whereas positive lags indicate that neural activity followed the kinematics in time. To make this distinction explicit, the discontinuity between −0 and 0 is marked on the x-axis.
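The lag sweep can be illustrated with a small helper. The sign convention follows the text (negative lag: neural activity precedes the kinematics it predicts); function names and the scoring callback are our own, not the paper's code.

```python
import numpy as np

def apply_lag(Z, X, lag):
    """Align spike counts Z with kinematics X at a given lag (in bins).
    Negative lag: neural activity precedes the kinematics it predicts."""
    if lag < 0:
        return Z[:lag], X[-lag:]
    if lag > 0:
        return Z[lag:], X[:-lag]
    return Z, X

def best_lag(Z, X, lags, score_fn):
    """Grid search over lags; return the lag maximizing score_fn plus all scores."""
    scores = {lag: score_fn(*apply_lag(Z, X, lag)) for lag in lags}
    return max(scores, key=scores.get), scores
```

In the full analysis this search would be nested with the bin-length sweep (10-250 ms), scoring each combination by decoding performance (CC or rRMSE) rather than by raw correlation.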
For area M1, the lag lengths yielding the best decoding performance in each recording clustered between −70 and 10 ms. A two-sided sign test showed that the median of these lag lengths (−10 ms) was not significantly different from zero for CC (p=0.07), but it was for rRMSE (median=−50 ms, p=0.04). For area F5, the optimal lag lengths were shifted further into the negative range, indicating that the neural data used for decoding preceded the kinematic prediction: for both CC and rRMSE, the median (−40 ms for CC, −60 ms for rRMSE) was significantly different from zero (p=0.002 for both). This was not the case for AIP: there, the median of the optimal lag lengths was not significantly different from zero (p=1 for CC, median=−0 ms; p=0.45 for rRMSE, median=−15 ms). Together, the optimal lag lengths for decoding from M1 were close to zero with a shift into the negative range, which became even larger for F5. Optimal lags for AIP, however, clustered around zero. This is also apparent in figures 7(c) and (d), where, around zero lag, the lines for F5 are shifted furthest into the negative range, followed by those for M1 and AIP.
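The two-sided sign test on medians used here can be implemented exactly with a binomial tail sum; this is a generic sketch that discards ties, not the authors' code.

```python
import numpy as np
from math import comb

def sign_test(x, m0=0.0):
    """Two-sided sign test for H0: median(x) = m0.
    Exact binomial p-value; ties with m0 are discarded."""
    diffs = np.asarray(x, dtype=float) - m0
    diffs = diffs[diffs != 0]
    n, k = len(diffs), int(np.sum(diffs > 0))
    tail = min(k, n - k)                     # count of the less frequent sign
    p = 2.0 * sum(comb(n, i) for i in range(tail + 1)) / 2.0 ** n
    return min(p, 1.0)
```

Note that with ten repetitions all falling on one side, the smallest attainable two-sided p-value is 2/2^10 ≈ 0.00195, which is consistent with the p<0.002 bound reported for the chance-level comparisons above.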
Moreover, although the medians of the optimal lag lengths did not differ significantly between areas (Kruskal-Wallis test, p>0.05), these plots demonstrate that there was much more variation in optimal lags for AIP than for the other two areas: for CC, the standard deviations of the optimal lag distributions were 30.62 ms (M1), 29.83 ms (F5), and 83.11 ms (AIP). This difference was significant (non-parametric Levene's test, p<0.03). Similar results were observed for rRMSE, with standard deviations of 37.06 ms (M1), 38.89 ms (F5), and 452.86 ms (AIP); here, too, AIP was significantly different (non-parametric Levene's test, p<0.01).
Furthermore, when decoding with data from AIP, a second, smaller peak in performance appeared for lags around −1700 ms (figures 7(a) and (b)), and the ranking of the three areas in terms of decoding performance was reversed for these long lags. In contrast to AIP, the performance of F5 and M1 improved steadily the closer in time the neural data used for decoding lay to the kinematics to be predicted. For lags more negative than approximately −600 ms, F5 still yielded higher decoding accuracy than M1. For positive lags, decoding performance decreased monotonically, and there was no second peak in any area.
In conclusion, while a bin length of 10 ms was consistently optimal across areas, the impact of lag on decoding performance varied strongly, and the optimal lag lengths yielding the highest decoding performance differed between M1, F5, and AIP.

Figure 6. Mean CCs (a) and rRMSEs (b) were averaged across groups of joints (thumb joints, index finger joints, middle finger joints, ring finger joints, little finger joints, wrist joints, and arm joints including shoulder and elbow joints) and recording sessions, separately for M1, F5, and AIP. Bars and error bars: mean and standard deviation. Orange lines and error bars: chance decoding mean and standard deviation (see materials and methods). Columns above each bar indicate significant differences in decoding performance of a particular joint group relative to all other joint groups (significant groups emphasized by color) (one-way ANOVA and Tukey-Kramer multicomparison analysis, p<0.05).

Discussion
This paper investigated the possibility of decoding complete hand, wrist, and arm kinematics (represented by 27 DOF) from single and multiunit activity in the hand areas of motor (M1), premotor (F5), and parietal cortex (AIP). To our knowledge, this is the first study that combines and compares these three grasping areas for predicting versatile, continuous hand kinematics. Simultaneous recordings of population activity from these three areas with multi-electrode arrays made it possible to examine differences in decoding performance between these areas and evaluate the information content with respect to hand and arm kinematics.

Movement reconstruction with primary and premotor cortex
Continuous trajectories of 27 joint angles could be reconstructed accurately over time using single and multiunit activity from M1, F5, or AIP. However, decoding performance varied between these areas. The highest performance was achieved when using M1 for decoding, followed by F5 and AIP (figures 3 and 4). It is known that various parameters of hand and arm movements, such as joint and muscle representations, wrist position, force, and grasp configurations, are encoded in M1 (Thach 1978, Georgopoulos et al 1986, Ashe and Georgopoulos 1994, Taira et al 1996, Ashe 1997, Kakei et al 1999, Rathelot and Strick 2006, Umilta et al 2007) and F5 (Rizzolatti et al 1988, Kakei et al 2001, Raos et al 2006, Fluet et al 2010). Furthermore, F5 has both strong projections to the digit area of M1 (Muakkassa and Strick 1979, Matelli et al 1986, Dum and Strick 2005, Borra et al 2010) as well as direct connections to the spinal cord via cortico-spinal neurons originating from F5 (He et al 1993, Galea and Darian-Smith 1994). F5 has therefore been considered to operate, to some extent, at the same hierarchical level as M1 (Dum and Strick 2005). Based on these findings, it is not surprising that both M1 and F5 were able to predict joint kinematics quite precisely (figures 3 and 4).
The suitability of M1 and F5 for decoding arm and hand kinematics has been investigated previously (Carmena et al 2003, Lebedev et al 2005, Ben Hamed et al 2007, Vargas-Irwin et al 2010, Bansal et al 2012, Aggarwal et al 2013). However, these studies employed different decoding algorithms, task types, and decoding parameters, which makes it difficult to compare their results with ours. Furthermore, the number of decoded DOF was much lower in most of these studies. The following three studies were most closely related to ours: first, Aggarwal et al (2013) reported decoding performances obtained with activity from M1, PMd, and PMv that closely matched our results for predictions from M1 and F5 (PMv), both combined and separately. Second, Vargas-Irwin et al (2010) recorded single unit activity from the hand region of M1 and reported similar decoding accuracy levels for hand kinematics in an object-fetching task. Third, Bansal et al (2012) decoded hand grasping movements separately from premotor and motor cortex and found no significant difference in decoding accuracy between M1 and PMv. However, the decoding performance (CC) reported in their study was lower than ours.
All three studies included circumstances, absent from our study, that might have inflated their decoding performance: first, in all studies the number of objects used for grasping was substantially lower than in our task, leading to less complex grasp kinematics. Second, these studies used only correctly performed trials or selected time periods of correct trials for prediction, whereas our study predicted all occurring kinematics, including incorrect trials and inter-trial intervals. We deliberately included these epochs in order to mimic more natural behavior, in which unexpected movements occur in addition to those demanded by the task design. Both factors introduced more variance into our kinematics and made precise decoding more difficult.
Third, the studies of Bansal et al (2012) and Vargas-Irwin et al (2010) used only a selection of the best units for decoding, and an optimal time lag between individual DOF and neural activity was determined separately for each neuron. In contrast, our study used a single optimal bin and lag length for the entire neuronal population; the evaluation of each unit's contribution to kinematic features was left to the automated training of the Kalman filter. Despite all these points, our decoding accuracy matched, or even exceeded, that reported in those studies.
In contrast to units in M1, a group of neurons in F5 already responds to object presentation (Murata et al 1997, Raos et al 2006, Lehmann and Scherberger 2013). This activity might have been mistakenly interpreted by the decoder as movement information. Hence, by not only decoding sequences of movement but also including long resting periods while the monkey was mentally preparing its grasp, the overall decoding performance of F5 might have been decreased relative to M1. Furthermore, prediction accuracy for M1 might have benefited from the fact that our M1 electrodes were located in the anterior bank of the central sulcus, where direct cortico-motoneuronal units are located (Rathelot and Strick 2009), as opposed to the cortical surface anterior to the central sulcus. Neurons in the anterior bank of M1 are therefore closely linked to these cortico-motoneurons, which could explain the suitability of this area for the decoding of hand kinematics and its strong neural selectivity for particular finger movements and grip types (Schaffelhofer et al 2015a).

Movement reconstruction with parietal cortex
Decoding with AIP yielded the lowest decoding performance. It is possible that the signal-to-noise ratio of spikes versus background was lower in AIP than in M1 and F5, which might have influenced the decoding performance. Despite this, the decoding accuracy of AIP was significantly above chance (figures 3 and 4), suggesting that AIP carries substantial decodable information about movement kinematics. These signals in AIP do not necessarily have to be movement commands, but could represent other information related to movement execution, such as object features relevant for grasping. However, the temporal movement characteristics appeared to be precisely captured by AIP, which supports the notion that motor-related information in AIP is not exclusively categorical, like visual features, but also includes temporal aspects of movement execution. In another decoding study, in which single unit activity from M1, F5, and AIP was used to predict kinematic states such as resting and movement, we found that AIP activity could clearly distinguish between these categories (Menz 2015). Furthermore, movement onset could be detected accurately with single units from AIP, confirming the presence of temporal movement information (Menz 2015).
Predicted trajectories during resting periods often showed undesired jitter when decoding from AIP. This might be explained by the fact that AIP neurons respond to the visual presentation of objects and therefore exhibit elevated activity well before actual movement execution (Sakata et al 1995, Murata et al 2000, Baumann et al 2009, Townsend et al 2011, Lehmann and Scherberger 2013). This visual activity could have affected the decoder and might have caused noisy movement residuals during periods of rest. To our knowledge, this is the first study investigating area AIP as a potential brain region for the continuous decoding of hand and finger movements. Earlier studies using AIP and F5 only predicted categorical variables like grip types: Lehmann and Scherberger (2013), Schaffelhofer et al (2015a), and Townsend et al (2011) found that F5 was better suited for the decoding of grip types than AIP. Furthermore, Townsend et al (2011) observed that decoding from the combined activity of F5 and AIP did not increase decoding performance significantly in comparison to F5 alone. Both findings agree well with our results (figures 3 and 4). The fact that decoding performance did not improve when neural activities from F5 and AIP were combined suggests that AIP largely shares its kinematics-related information with area F5.

Influence of population size on decoding performance
We demonstrated that the differences in decoding performance between AIP, F5, and M1 were consistent between animals (figure 4) and did not primarily depend on the number of neurons available for decoding, but rather reflected the type of information encoded in these areas (figure 5). Even a very small number of randomly selected units raised decoding accuracy above chance. As suggested previously (Vargas-Irwin et al 2010), neurons in M1 and F5 do not seem to reflect one specific kinematic variable but instead carry information about a broader range of movement parameters, which allows reasonable movement reconstruction even with a few neurons. This also seemed to hold true for units in AIP.

Reconstruction of proximal and distal joints
Although the task demanded a wide variety of finger movements and hand configurations, no substantial difference was found between decoding finger and wrist joints (figure 6). However, a slight tendency towards higher decoding performance for the index, middle, and ring fingers was observed. These fingers also showed higher synergies in their movements. Since the Kalman filter combined state information of all DOF when predicting a single joint (a priori estimate), more information was available for joints moving in a correlated fashion, and the algorithm could therefore predict such DOF more accurately. In contrast, thumb movements were more independent of the other fingers, showed more variation, and were therefore harder for the decoder to predict. Furthermore, the little finger was not actively used for grasping in many conditions but rather moved along with the other fingers, often without clasping the object properly. Since it was used more passively, its movement intention might have been underrepresented in the brain, which might explain the slightly reduced decoding performance. For a few objects this was also true for the thumb, such as those on the 'special' turntable (see figure 1(b)), where the grasp was carried out mainly with fingers 2-4 (similar to the 'spherical power grasp' described in Cutkosky and Wright 1986).
In contrast, shoulder and elbow movements were predicted with significantly higher accuracy. Since our task was designed to elicit dexterous grasping kinematics, arm movements were more stereotypical than finger and wrist movements. This might have facilitated the prediction of shoulder and elbow movements. Nevertheless, by including incorrect trials that were often aborted midway or contained unexpected movements, we tried to prevent the occurrence of a stereotypical temporal structure in the kinematics. However, since we recorded from highly trained animals, the number of incorrect trials was generally low.
In conclusion, although the arrays were carefully placed in the hand areas of primary, premotor, and parietal cortex, both distal and proximal joints could be predicted with high accuracy. This finding is supported by the observation of Vargas-Irwin et al (2010) that individual neurons in primary motor cortex encode information about both proximal and distal kinematics. McKiernan et al (1998) showed that a large portion of cortico-motoneuronal cells have muscle fields in both distal and proximal muscles. Since we recorded in the 'new' part of M1 (Rathelot and Strick 2009), it is likely that our recorded population contained a considerable portion of such cells.

Optimal decoding parameters and their network implications
In our study, a bin length of 10 ms yielded the highest decoding performance in 95% of the tested data sets, regardless of which brain area was used for prediction. Previous decoding studies used bin lengths of 100-150 ms, however without explaining their choice (Vargas-Irwin et al 2010, Bansal et al 2012, Aggarwal et al 2013). Long time windows will likely include the time point when the information content is highest, but they also act like low-pass filters. In contrast, with a short window, changes in firing rate can be detected with higher temporal precision, which is advantageous especially for brain areas with lower tuning strength, like AIP. However, it is crucial to combine the bin length with an adequate time lag to find the optimal relation between brain activity and actual movement execution. Indeed, we found that the time point from which data was used for decoding had a strong impact on decoding performance, and the optimal time lag varied considerably across the three areas (figure 7).
When decoding from M1, we found the highest amount of information to be present when the neural data preceded the hand kinematics by 0-50 ms. Similar results were reported by Morrow and Miller (2003), who decoded muscle activity from neurons in M1 with different lags. Their peak performance was obtained with a lag of −50 ms; it dropped faster than in our study when lags deviated from the optimum, and it levelled off at a time lag of around ±250 ms. The faster decrease might be due to the use of a different decoding algorithm, or to the fact that sequentially recorded neuronal activity was used for decoding. Additional information encoded in the coherence or phase synchrony within the neuronal population (Averbeck and Lee 2004) was therefore not contained in their data but was available to our decoder due to simultaneous array recordings. This might have helped maintain a relatively high decoding performance for lags longer than optimal. For a time lag of around ±500 ms, however, their decoding performance and ours were comparable.
Other studies determining the latency between spikes of cortico-motoneuronal cells in primary motor cortex and the onset of muscle activity in distal muscles reported an onset of post-spike effects at about 10 ms after neuronal activity (Fetz and Cheney 1980, Kasser and Cheney 1985, Lemon et al 1986, McKiernan et al 1998). Since we recorded in the bank of M1 where cortico-motoneuronal cells are found (Rathelot and Strick 2009), our M1 population might be more directly linked to these cells, which might explain the tendency towards shorter lags in our results as compared to Morrow and Miller's.
Optimal time lags for F5 were longer than for M1. However, if neural activity preceded kinematics by long negative lags, F5 yielded higher decoding accuracies than M1. This indicates that grasp intentions are present in F5 earlier than in M1, in agreement with Umilta et al (2007).
Furthermore, our time lag results are in line with the fronto-parietal grasp network hypothesis (Jeannerod et al 1995), which states that visual information about an object to be grasped is projected from AIP to F5, where a grip type is selected. Grip type information is then forwarded to neurons in M1 and translated into a motor command that is relayed to the spinal cord. However, the role of AIP in this process seems to be more intricate. Decoding performance peaked for lags close to zero; however, we also found an additional smaller peak for time lags of about −1700 ms. It is very unlikely that, in natural grasping movements, AIP represents movement intentions more than 1.5 s before execution. Instead, this second peak could be caused by the visual task instruction: the object was illuminated approximately 1.5 s before movement onset, which is known to cause a visual activation of AIP (Sakata et al 1995, Schaffelhofer et al 2015a). Furthermore, Schaffelhofer et al (2015a) showed that object information (and to a lesser extent grip type) could be decoded from AIP activity during the cue epoch of the task. Clearly, object features and hand configurations were coupled in our task, so cue information could have been used by the decoder for movement prediction. However, since visual information could only serve as a rough guide for the prediction of individual movement kinematics, decoding performance for these long lags was lower than the accuracy peak obtained with very short time lags.
Although decoding performance for AIP was highest for lags around zero, it is unlikely that a direct movement command is encoded in that neural activity. Mulliken et al (2008) found similarly short lag lengths to yield the highest amount of mutual information between units in the PPC and movement parameters. They argued that PPC neurons could encode a forward estimate of the movement and are therefore not only involved in movement planning but also in online movement control that utilizes an efference copy from premotor and motor cortices (Mulliken et al 2008). The same is likely the case for AIP. It has been suggested that so-called motor neurons in AIP could represent a copy of the motor plan from F5 in order to compare it with the visual object properties and subsequently fine-tune the motor command (Sakata et al 1995, Sakata et al 1997, Sakata et al 1999, Murata et al 2000). Our findings are compatible with this hypothesis.

Conclusion
This study shows for the first time that high-dimensional reaching and grasping movements can be decoded continuously over time with high precision not only from areas M1 and F5 but also from the parietal area AIP. Although the decoding performance from AIP was inferior to that from motor and premotor cortex, our results support the idea that AIP not only encodes categorical visual information, as shown previously by other studies, but also contains temporal movement information, at least to some extent. However, kinematic information in AIP seems to be largely redundant with the movement intentions encoded in area F5.