Spatiotemporal dynamics of responses to biological motion in the human brain

We sought to understand the spatiotemporal characteristics of biological motion perception. We presented observers with biological motion walkers that differed in terms of form coherence or kinematics (i.e., the presence or absence of natural acceleration). Participants were asked to discriminate the facing direction of the stimuli while their magnetoencephalographic responses were concurrently imaged. We found that two univariate response components can be observed around ~200 msec and ~650 msec post-stimulus onset, each engaging lateral-occipital and parietal cortex prior to temporal and frontal cortex. Moreover, while univariate responses show biological motion form-specificity only after 300 msec, multivariate patterns specific to form can be well discriminated from those for local cues as early as 100 msec after stimulus onset. By finally examining the representational similarity of fMRI and MEG patterned responses, we show that early responses to biological motion are most likely sourced to occipital cortex while later responses likely originate from extrastriate body areas.


Introduction
Understanding biological movements in the environment is relevant to everyday survival (e.g., detection of predators) and social functioning. It is no surprise, then, that the human brain has evolved to have a dedicated neural system for processing biological motion stimuli. Since Johansson (1973) introduced point-light displays and reported behavioural data, a large body of work has offered hints as to the properties of the underlying mechanisms: notably, Johansson's observation that the recognition of the human body is only supported when body motion is presented in successive frames, and breaks down when a static frame is presented, was taken as evidence that biological motion mechanisms start with the processing of local motion signals. Indeed, findings from Mather et al. (1992) seemed to be congruent with this idea: when observers were presented with stimulus frames that alternated with a mask of blank frames, their ability to discern the walking direction of the point-light stimulus dropped to chance level. These findings suggest that local motion detectors responsible for detecting frame-to-frame changes in motion are crucial to solving the direction of the stimulus.
However, a collection of neuropsychological cases offered a different perspective. Vaina et al. (1990) reported that patients with lesions in the human motion complex (hMT+) could not perform low-level motion integration tasks, but could identify biological motion. Patient LM, who could not detect stimuli in motion, could recognize action from biological motion (McLeod, 1996). In normal human observers, repetitive transcranial magnetic stimulation (TMS) over hMT+, which severely impacts motion perception, does not affect the perception of biological motion stimuli. Biological motion perception is, however, significantly impaired with stimulation of posterior superior temporal sulcus (pSTS) (Grossman et al., 2005). These cases are fascinating because they suggest that lower-order motion perception is in fact not a requisite for at least some aspects of biological motion perception. This perspective is also supported by additional psychophysical work. For example, Beintema and Lappe (2002) presented observers with both regular walkers and sequential position walkers in which the position of each light-point was reallocated to a different position on the limb in each successive frame, thereby disrupting local motion. They found that observers could still reliably detect a walking human figure and discriminate its walking direction. Later work seemed to offer some reconciliation of these data; that is, biological motion perception appears to be served by multiple, independent mechanisms: one that is sensitive to stimulus shape (form) and an alternate that is based on local invariants contained in body kinematics (Dayan et al., 2007; Troje & Westhoff, 2006). The local invariants appear to relate, at least in part, to the signature-like vertical acceleration in biological movements (Chang & Troje, 2009a, b).
Behaviourally, the form-based and kinematics-based mechanisms appear to be differentially sensitive to learning and masking (Chang & Troje, 2009a, b), have orientation-dependency (Hirai, Chang, et al., 2011a; Troje & Westhoff, 2006), are differentially susceptible to attention and eccentricity (Hirai, Saunders, & Troje, 2011; Thompson et al., 2007; Thornton et al., 2002; Troje & Chang, 2013), and appear to have differing underlying genetic influences (Wang et al., 2018). But where are they reflected in the brain?
Early neuroimaging work has indicated greater activity in the posterior superior temporal sulcus (pSTS), premotor cortex, and inferior frontal gyrus for intact versus scrambled point-light stimuli, suggesting a degree of sensitivity of these regions to at least the morphological aspects of biological motion (Grossman & Blake, 2002; Saygin, 2007; Saygin et al., 2004). The only work to our knowledge attempting to isolate the neural sites responsible for perceiving biological kinematics has implicated both the STS and the human motion complex, hMT+ (Jastorff & Orban, 2009). We have recently reported both univariate and multivariate responses in a motor-thalamic area (the ventral lateral nucleus) to both biological form and kinematics (Chang et al., 2018).
While a considerable collection of fMRI work has revealed a set of loci distributed across the human brain that is engaged during the perception of biological motion (Chang et al., 2018; Grezes et al., 2001; Grossman & Blake, 2002; Jastorff & Orban, 2009; Saygin, 2007; Saygin et al., 2004), this work lacks temporal information about the responses that could be particularly interesting in light of the suggestion that certain mechanisms underlying the perception of biological motion may have particular ecological relevance. Specifically, it has been posited that mechanisms underlying sensitivity to local kinematics information may serve the detection of lifeform in the visual environment (Johnson, 2006; Troje & Westhoff, 2006). This line of thinking is driven by comparative work showing that the ability to perceive biological motion extends to non-human mammalian and non-mammalian species, including chimpanzees (Tomonaga, 2001), dolphins (Herman et al., 1990), birds (Omori, 1997; Regolin et al., 2000; Vallortigara et al., 2005), and fish (Nakayasu & Watanabe, 2014). Importantly, these animals respond to stimuli beyond those of their own species. This has led to the suggestion of a visual filter tuned to detecting some species-invariant characteristic: the motion profile exhibited by limbs of a terrestrial animal in locomotion (Johnson, 2006; Troje & Westhoff, 2006). The posited mechanisms have been suggested to be evolutionarily ancient in nature, leading to a further prediction that they should be found in more primitive parts of the nervous system. If this is the case, responses to local kinematics information may occur (temporally) earlier than those to body form, even in regions that are sensitive to both classes of biological information (e.g., in superior temporal cortex).
To date, a body of electroencephalography (EEG) and magnetoencephalography (MEG) work has shed some light on the temporal characteristics of biological motion perception (Hirai et al., 2003, 2005; Inuggi et al., 2018; Jokisch et al., 2005; Krakowski et al., 2011; Pavlidou et al., 2014; Pavlova, 2004; Pavlova et al., 2005; Safford et al., 2010; White et al., 2014). Hirai et al. (2003) presented both intact and scrambled biological motion to observers while measuring their EEG responses. In this case, the intact walker was derived from the synthetic motion algorithm of Cutting (1978) and hence carried only intact form, but was lacking an important kinematic invariant (Saunders et al., 2009). They found that the perception of both the biological and the scrambled control stimulus elicited (negative) response peaks at 200 (N200) and 240 msec (N240) post-stimulus onset in occipitotemporal cortex, although for both peaks, responses were larger for the biological motion stimulus than for the scrambled walker. Two negative response components, at 180 msec and between 230 and 360 msec, were also observed by Jokisch et al. (2005) in response to both intact walkers and the scrambled stimuli, sourced towards the superior temporal region. Amplitudes were again larger for upright and inverted walkers as compared to the scrambled control stimuli. Similar early response components have been observed in other EEG work gauging modulatory effects of adaptation and attention in biological motion perception (Hirai et al., 2005; Pavlova et al., 2006; Safford et al., 2010) as well as in MEG work looking at oscillatory (gamma- and beta-band) activity (Pavlidou et al., 2014; Pavlova, 2004). Finally, an interesting MEG study presented first a scrambled point-light walker (in an attempt to attenuate activity in early visual areas), followed by an upright or inverted intact walker, or a second scrambled stimulus.
They observed two response components: a peak neuromagnetic response at 250 msec after the onset of the first stimulus, and a second response at around 350 msec after the onset of the second stimulus. Responses to the second stimulus, estimated to lie in occipitotemporal cortex, were larger for intact walkers than for the scrambled walkers. Importantly, as in the earlier EEG studies, this particular study used an intact walker that contained both global form and intact kinematics information as the primary stimulus of interest. The scrambled stimulus adaptor was an assumed control stimulus for lower-order (non-biological) features. Therefore it is difficult to understand the response characteristics of the biological kinematics response independently from that reflecting the detection of global form in the currently available literature.
Here, we are primarily interested in achieving a better understanding of the temporal profile of responses to biological motion. We used stimuli that preserve intact form information, preserve kinematics information, or have neither intact form nor kinematics information. Using fMRI, we have shown previously that these stimuli are effective for isolating form-related and kinematics-related responses both in and outside of cortex (Chang et al., 2018).
Stimuli were presented upright or inverted in an event-related configuration and observers were asked to indicate the apparent walking direction of the stimulus while neuromagnetic responses were measured concurrently. Our questions were three-fold. First, how early does the onset of biological-motion-specific responses occur? While a small body of work has seemingly indicated two response waveforms to intact walkers, it is unclear whether these response profiles are associated with the perception of the human body structure (i.e., global information), or with the processing of biological kinematics (i.e., local cues), as these two types of information have often been conflated in earlier work (Hirai et al., 2003; Pavlova et al., 2005). Moreover, if responses to the two forms of information can be carefully dissociated, we can make further predictions: if kinematics-related mechanisms are related to the rapid detection of biological entities in the environment, we may find responses to the local stimulus to be earlier than those to the global stimulus, which is instead thought to tap into a learned mechanism that responds to body form (Johnson, 2006; Troje & Westhoff, 2006).
We also asked whether multivariate pattern analysis (MVPA) can be applied for decoding temporal, biological motion data in order to tease apart the onset of structure versus kinematics discriminability. To our knowledge, no studies thus far have attempted to understand neuromagnetic responses to biological motion using multivariate pattern analysis techniques. Lastly, equipped with fMRI data obtained under identical stimulus parameters (Chang et al., 2018), we asked whether we could better understand the sources of the MEG signals by performing a representational similarity analysis (Edelman, 1998; Kriegeskorte et al., 2008), and asking whether MEG representations correlate with representations extracted with fMRI (Cichy et al., 2014, 2016).

Participants
Twenty-one observers (mean age of 24.2 years; SD 3.3 years; 19 males) participated in this study. This group size was selected following a power analysis using data from prior fMRI work (Chang et al., 2018), using identical stimuli, in order to achieve medium-to-large (d = .5–.8) effect sizes. All participants had normal or corrected-to-normal vision, and provided written informed consent in line with ethical review and approval of the work by the ethics committee of the National Institute of Information and Communications Technology (NICT), Japan. Inclusion/exclusion criteria (i.e., with regards to visual screening, and magnet-compatibility) were established prior to data acquisition and analysis.

Stimuli
Stimuli were point-light walking sequences based on motion-captured data (Troje, 2002). We represented this motion with a set of 11 dots showing the walker in sagittal view (facing either rightward or leftward) and subtracted overall translating motion such that walking motion was stationary as if on a treadmill. The average walker contains both form (structure from motion) and kinematics information as conveyed by horizontal and vertical asymmetries, such as acceleration. From this standard walker, we derived three types of stimuli (Fig. 1a): a form-from-motion stimulus that we refer to hereafter as the global stimulus, a spatially "scrambled" stimulus that nonetheless retains natural kinematics information (referred to as the local natural stimulus), and a scrambled stimulus that is devoid of both form-from-motion and natural kinematics (referred to as the local modified stimulus). These stimuli were identical to those used in our previous fMRI work (Chang et al., 2018). Briefly, our global walker was constructed by keeping the spatial layout of the dot trajectories intact, but replacing the horizontal component of each trajectory by the average of that trajectory and the left and right mirror-flipped variants. The resulting dot motion is perfectly symmetric about the horizontal axis (see Fig. 1b, global, for the trajectory of the foot motion) and therefore does not contain any cue as to the facing direction of the walker. Left-right asymmetries are only retained in the structural configuration of the human body. The local natural walker was constructed by starting with the original walker and randomly reallocating the horizontal spatial positions of the dot trajectories within a region constrained to that occupied by an intact walker. This manipulation renders a walker that retains natural kinematics information but not the coherent structure of a veridical walker (Fig. 1b, local natural).
Finally, the local modified walker contains neither structure-from-motion information, nor valid kinematic information (Fig. 1b, local modified). This walker was constructed by taking the original walker and, again, randomly shuffling the horizontal spatial positions of the dots. For this walker, we additionally removed the natural kinematics cues by forcing each individual dot to move with constant speed along its original trajectory, with the speed chosen to be equal to the average speed of the corresponding dot motion in the original walker (Chang & Troje, 2009a, b). This was achieved by interpolating each individual dot trajectory and computing linear arc length, thereby eliminating the vertical asymmetries caused by gravitational acceleration.
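The constant-speed manipulation can be illustrated with a minimal sketch. It assumes a dot trajectory is stored as an array of (x, y) positions over one closed gait cycle; the function name `constant_speed_resample` and the synthetic elliptical trajectory are hypothetical and are not taken from the original stimulus code. Because the path and the number of frames per cycle are unchanged, the resulting constant speed equals the average speed of the original dot.

```python
import numpy as np

def constant_speed_resample(xy, closed=True):
    """Resample one dot trajectory so that it is traversed at constant speed.

    xy: array of shape (n_frames, 2), one gait cycle of a single dot.
    Returns an array of the same shape that follows the same path but
    covers equal arc length between successive frames.
    """
    xy = np.asarray(xy, dtype=float)
    # For a cyclic trajectory, append the first point to close the loop.
    path = np.vstack([xy, xy[:1]]) if closed else xy
    seg = np.linalg.norm(np.diff(path, axis=0), axis=1)   # segment lengths
    s = np.concatenate([[0.0], np.cumsum(seg)])           # cumulative arc length
    # Equally spaced arc-length targets (endpoint dropped for a closed loop).
    s_new = np.linspace(0.0, s[-1], len(xy), endpoint=not closed)
    x = np.interp(s_new, s, path[:, 0])
    y = np.interp(s_new, s, path[:, 1])
    return np.column_stack([x, y])

def step_cv(xy):
    """Coefficient of variation of frame-to-frame step lengths (closed loop)."""
    loop = np.vstack([xy, xy[:1]])
    steps = np.linalg.norm(np.diff(loop, axis=0), axis=1)
    return steps.std() / steps.mean()

# Hypothetical trajectory: an ellipse traversed with uneven speed.
u = np.linspace(0.0, 1.0, 200, endpoint=False)
phase = 2 * np.pi * (u + 0.1 * np.sin(2 * np.pi * u))
original = np.column_stack([np.cos(phase), 0.5 * np.sin(phase)])
modified = constant_speed_resample(original)
```

Here `step_cv(original)` is large because the dot accelerates and decelerates along its path, whereas `step_cv(modified)` is close to zero.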
The three types of stimuli (global only, local natural, and local modified) were shown at both upright and inverted orientations. An inverted variant of the global walker constituted a complete inversion of the shape and the local trajectories. An inverted variant of each of the local natural and local modified walker types was generated by mirror-flipping each dot's trajectory about the horizontal axis, while retaining its vertical position. This retains the vertical order of the dots. For all stimuli, the starting temporal position of the gait cycle was randomly selected on each trial. Horizontal spatial shuffling (for the local natural and local modified stimuli) was also randomized on each trial. Gait frequency was .93 Hz and dots were white (153.7 cd/m²) on a black background (.92 cd/m²).
Stimuli were presented on a PC (HP Z230SF). Graphics rendering was implemented by an NVIDIA Quadro K600 graphics card set to display 1024 × 768 pixels at 60 Hz. Stimuli were front-projected using an LCD projector (Panasonic PTDZ680) and viewed through a 45 deg tilted mirror mounted in front of the observer, who was lying supine. Stimuli subtended visual angles of 2.9 × 6.4 deg².

Task
A schematic depicting the sample flow of a single trial is presented in Fig. 1c. On each trial (event), observers were presented with a 200 msec fixation cross (red) accompanied by a static, white, square-shaped probe for the photosensor (serving as a time stamp to synchronize MEG measurements with stimulus presentation) presented on the top right corner of the screen for the initial 100 msec. This was followed by a 1000 msec fixation cross (grey), a 500 msec stimulus presentation accompanied by a white photosensor probe, a 2000 msec fixation cross (grey), and finally a 3000 msec fixation cross (green) that constituted the response period. Intertrial interval was uniformly sampled between 1000 and 2000 msec. Fixation crosses were .5 × .5 deg² in size, and photodiode probes were 1.5 × 1.5 deg² in size, projected on an occluded part of the screen and hence invisible to the observers. Observers were asked to withhold their blinking and to submit their response at the onset of the response probe (green fixation); they were asked to indicate whether the walker appeared to be facing leftwards or rightwards by pressing one of two buttons on an MEG-compatible response box (Psychology Software Tools, PST-100445).
Cortex 136 (2021) 124–139

MEG design and acquisition
An MEG acquisition run consisted of 60 trials, comprising 5 repetitions each of the three main stimulus types presented upright or inverted, and facing leftwards or rightwards. Trial order was randomized. Each run lasted approximately 8 min during which MEG was continually measured, and observers completed a minimum of 8 runs (maximum of 12 runs). The entire MEG session, including setup and calibration, lasted approximately 2 h and participants were encouraged to take breaks in between each run. MEG data were acquired within a magnetically shielded room at the NICT CiNet imaging facility (Suita, Osaka, Japan), using a customized 360-channel whole-head MEG system (Neuromag 360, Elekta). The MEG system consisted of 204 planar gradiometers, 102 magnetometers, and 54 axial gradiometers. Magnetic signals were recorded at a sampling frequency of 1000 Hz. The pairs of planar gradiometers are located at 102 positions and measure derivatives in orthogonal directions (x and y). Data from all 360 channels were used for analyses. Note that we were not interested in response differences between the different meter types and treated the responses of all sensors as patterns (after adjusting for sensor size differences in MNE). Head position with respect to the sensor array was determined by five coils attached to the forehead and preauricular points, the positions of which were defined with a 3D digitizer. Electrooculograms were acquired using two electrodes that were attached approximately 1.5 cm above and below the left eyelid. The potential between these two electrodes was recorded simultaneously with the MEG signals using additional input sockets of the MEG system. The MaxFilter package automatically rejected bad channels using an internal algorithm, while the eye-movement-related signal was also used to check for signal quality artefacts by visual inspection.
The MEG data acquisitions and the stimulus presentation timings were synchronized trial-by-trial using a photo-sensor placed in the upper-right corner of the screen, over a white square probe that was presented during the first 100 msec of the pre-stimulus resting period and the biological motion stimulus presentation period (500 msec).
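As a consistency check on the design described above, the following short sketch enumerates the conditions and estimates the run duration from the trial timings given in the Task section. The variable names are illustrative only, and the mean intertrial interval is taken as the midpoint of the stated uniform range.

```python
from itertools import product

walker_types = ["global", "local natural", "local modified"]
orientations = ["upright", "inverted"]
facings = ["leftward", "rightward"]
repetitions = 5

# Every combination of walker type, orientation and facing direction.
conditions = list(product(walker_types, orientations, facings))
trials_per_run = len(conditions) * repetitions  # 12 conditions x 5 repetitions

# Fixed trial segments in msec: red fixation, grey fixation, stimulus,
# grey fixation, green fixation (response period).
segments_ms = [200, 1000, 500, 2000, 3000]
mean_iti_ms = (1000 + 2000) / 2          # midpoint of the uniform ITI range
trial_ms = sum(segments_ms) + mean_iti_ms

run_minutes = trials_per_run * trial_ms / 1000 / 60
print(trials_per_run, run_minutes)
```

This gives 60 trials per run and an expected run duration of 8.2 min, in line with the approximately 8 min reported.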

fMRI acquisition of anatomical images
T1-weighted anatomical images (1 mm³) for the MEG observers were acquired at the NICT CiNet imaging facility (Suita, Osaka, Japan), using a 3-T Siemens Trio MR scanner with a 32-channel, phased-array (whole) head coil.

fMRI experiment and data analysis
The results from the fMRI experiment, which was conducted on a distinct group of 19 observers, have been reported earlier (Chang et al., 2018). Five of these observers also participated in the MEG experiment.

Behavioral data analysis
Behavioral performances were quantified in terms of the proportion of correct discrimination responses. Discrimination accuracies were analyzed with a repeated-measures analysis of variance (ANOVA) comparing the three walker types (global only, local natural, local modified) at the two orientations (upright, inverted). Data satisfied parametric assumptions, and any variance (sphericity) violations were addressed with Greenhouse-Geisser corrections. Post-hoc comparisons were conducted by means of Bonferroni-corrected t tests (two-tailed).

MEG analysis
Raw MEG data were processed using MaxFilter (Elekta, Stockholm), which included corrections for head movements and applications of a temporal signal space separation (tSSS) filter (buffer length set to 10 sec) for automatic rejection of artifacts (Taulu & Simola, 2006 Figure S1. Averaged responses were baseline-corrected to the period of −200 to 0 msec (the 200 msec interval preceding stimulus onset). We performed three sets of analyses: source estimation (dynamic Statistical Parametric Maps, dSPM), sensor-level decoding (MVPA), and MEG-fMRI correlation of representational similarity (see Fig. 1d for a schematic of the logic across analytical procedures). Note that decoding-related analyses were not performed with source-estimated data within parcellated regions of interest due to the sparseness of the MEG sensors representing each ROI. That is, attempting to extract patterned responses within ROIs that are represented by a limited number of sensors is unlikely to be informative. Hence, multivariate analyses presented here were performed with data retained in sensor space, treating all 360 sensors as a single "ROI". While in principle, patterns could also be extracted using the grid-space on the cortical surface (see details for source analysis, below), retaining the data in the original sensor space allows us to extract patterned responses free of forward and inverse model assumptions.
Admittedly, performing decoding analyses in this manner removes spatial information that might be of interest. We note, however, that we were already equipped with spatially-resolved data from previous work using identical stimuli (Chang et al., 2018), and we use these, along with the MEG data, in the final representational-similarity analysis.
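The baseline correction applied to the averaged responses amounts to subtracting, for each channel, the mean signal over the 200 msec preceding stimulus onset. A minimal sketch is given below; the function name `baseline_correct`, the array layout (channels by time points), and the synthetic data are assumptions made for illustration rather than the actual processing code.

```python
import numpy as np

def baseline_correct(data, times_ms, tmin=-200, tmax=0):
    """Subtract each channel's mean over the baseline window [tmin, tmax).

    data: array of shape (n_channels, n_times)
    times_ms: array of shape (n_times,), time relative to stimulus onset
    """
    times_ms = np.asarray(times_ms)
    in_baseline = (times_ms >= tmin) & (times_ms < tmax)
    return data - data[:, in_baseline].mean(axis=1, keepdims=True)

# Synthetic example: 360 channels sampled at 1000 Hz from -200 to 1500 msec,
# each with an arbitrary constant offset.
rng = np.random.default_rng(0)
times = np.arange(-200, 1501)
raw = rng.standard_normal((360, times.size)) + rng.uniform(-5, 5, (360, 1))
corrected = baseline_correct(raw, times)
```

After correction, the mean of every channel over the pre-stimulus window is zero, so post-stimulus amplitudes are expressed relative to the resting level.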

MEG source analysis (dSPM)
We first conducted a conventional source analysis of the MEG signals in order to estimate response amplitudes across the cortical surface. The cortical surface of each observer was reconstructed from high-resolution T1-weighted MRI volumes using Freesurfer (https://surfer.nmr.mgh.harvard.edu/). The boundary element model (BEM) (Mosher et al., 1999; Munck, 1992; Oostendorp & Van, 1989) was used for forward computation. Currents were estimated at each source position using dynamic statistical parametric mapping (dSPM) (Dale et al., 2000). In dSPM, estimated currents at each source location were normalized by dividing by an estimate of the noise at that location, which can be obtained by applying the inverse operator to the signal covariance matrix (estimated from the −200 to 0 msec event window). The dSPM estimate then can be evaluated as a z-score and is a measure of the MEG signal-to-noise ratio (SNR) at each spatial location, which is related to neural activity. For each condition, a time series of estimated current at each source position or ROI was extracted and used for the later analyses. Further, baseline correction was performed on individual trial waveforms and on the average waveforms using the period from the −200 to 0 msec event window.
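The noise normalization at the heart of dSPM can be sketched in a few lines. The sketch assumes a linear inverse operator W (sources by sensors) and a covariance matrix C estimated from the baseline window; the function name `dspm` and the random matrices are purely illustrative and do not reproduce the actual inverse solution used here.

```python
import numpy as np

def dspm(inverse_op, noise_cov, sensor_data):
    """Noise-normalized source estimate.

    inverse_op:  (n_sources, n_sensors) linear inverse operator W
    noise_cov:   (n_sensors, n_sensors) baseline covariance C
    sensor_data: (n_sensors, n_times)
    """
    current = inverse_op @ sensor_data
    # Noise variance projected onto each source: the diagonal of W C W^T.
    noise_var = np.einsum("ij,jk,ik->i", inverse_op, noise_cov, inverse_op)
    return current / np.sqrt(noise_var)[:, None]

# Illustrative check with pure noise drawn from a known covariance.
rng = np.random.default_rng(1)
n_sensors, n_sources, n_times = 12, 5, 20000
mix = rng.standard_normal((n_sensors, n_sensors))
cov = mix @ mix.T
W = rng.standard_normal((n_sources, n_sensors))
noise = mix @ rng.standard_normal((n_sensors, n_times))
z = dspm(W, cov, noise)
```

When the input contains only noise, each source time course in `z` has a standard deviation close to one, which is why the dSPM value can be read as a z-score.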
The MEG source signal estimations were done using grids spaced 5 mm apart, defined in the reconstructed surface space. Note that activity deeper in the brain cannot be well captured by MEG. All analyses and inverse estimations were done in individual observer space. Based on previous work (Chang et al., 2018; Saygin, 2007; Saygin et al., 2004), we elected to perform analyses on eight spherical ROIs, each 5 mm in radius. Six were centered on centers-of-masses retrieved from regions identified using anatomical parcellation in Freesurfer: lateral-occipital, inferior-parietal, superior-parietal, inferior-temporal, lateral-orbitofrontal, and caudal middle-frontal cortex (see Supplementary Table S1 for coordinates). Note that the caudal portion of the middle-frontal cortex was included as previous work had shown that the perception of biological motion can, under certain conditions, engage premotor cortex (Saygin, 2007; Saygin et al., 2004). An additional two ROIs, those for the middle-temporal and superior-temporal cortex, were centered posteriorly on TAL −51, −52, 0 (left)/51, −49, 3 (right) for the MT+ complex, and −53.8, −60, 13 (left)/44, −60, 19 (right) for the superior-temporal cortex, in order to better match with historical work on biological motion perception (Grossman et al., 2000; Orban et al., 2003). Within these spherical ROIs, activity (dSPM) was extracted and region-averaged per condition, for each observer.
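Region averaging within a spherical ROI reduces to selecting the source points that lie within 5 mm of the ROI center and averaging their time courses. The sketch below assumes source positions are available in millimetres in the same coordinate frame as the ROI center; the function name `spherical_roi_average` and the regular test grid are hypothetical.

```python
import numpy as np

def spherical_roi_average(source_pos_mm, source_values, center_mm, radius_mm=5.0):
    """Average source time courses inside a sphere.

    source_pos_mm: (n_sources, 3) source coordinates in mm
    source_values: (n_sources, n_times) e.g. dSPM time courses
    center_mm:     (3,) ROI center in the same coordinate frame
    Returns the ROI-averaged time course and the boolean membership mask.
    """
    dist = np.linalg.norm(source_pos_mm - np.asarray(center_mm, dtype=float), axis=1)
    inside = dist <= radius_mm
    if not inside.any():
        raise ValueError("no source points fall inside the ROI")
    return source_values[inside].mean(axis=0), inside

# Illustrative 5 mm grid around the origin (not real cortical geometry).
axis = np.arange(-10, 11, 5)
grid = np.array([(x, y, z) for x in axis for y in axis for z in axis], dtype=float)
values = np.ones((len(grid), 50))
roi_timecourse, member = spherical_roi_average(grid, values, center_mm=(0, 0, 0))
```

With sources spaced 5 mm apart, a 5 mm sphere on this regular grid contains the center point and its six nearest neighbours, which illustrates how few nodes contribute to each ROI.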

MEG sensor-level decoding of conditions (MVPA)
Next, we considered the data multivariately, treating whole-brain sensor-level data as response patterns. This allowed us to determine the time course with which MEG response patterns can discriminate between conditions (independent of their univariate amplitudes). We entered the data into a multivariate pattern analysis using linear support vector machine (SVM) classifiers (Chang & Lin, 2011) independently for each observer (Cichy & Pantazis, 2015). For each stimulus event from −200 to 1500 msec, we extracted preprocessed MEG data and arranged them as 360-dimensional vectors (corresponding to the 360 sensors). This yielded, for each time point and each condition, a 360-element pattern vector. To increase signal-to-noise ratio, we computed responses using a sliding window of 40 msec. This was done in order to obtain more robust spatial patterns of the 360 sensors. We also generated pattern instances by averaging randomly-sampled sets of five stimulus events. As each subject completed a total of 8–12 runs, their MVPA analyses were conducted on a total of 16–24 'five-trial-averaged' events per condition. The sensor time-course data were subsequently converted to Z scores. We then used a leave-one-trial-out cross-validation to train the SVM to discriminate between all possible pairwise combinations of responses for the three walker types (global, local natural, local modified) at each orientation. For each time point (1 msec) and each pairwise comparison, N-1 vectors from a single condition were assigned to the training set and the remaining used as the test data to assess classification performance. This process was repeated 1000 times with random assignment of training and test data. Final decoding accuracies constituted the average accuracy across all iterations.
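The two signal-to-noise steps described above (five-trial averaging and the 40 msec sliding window) can be sketched as follows. The function names, the centred boxcar window, and the use of disjoint random groups are assumptions made for illustration; the classifier itself is omitted. Note that with 10 trials per condition per run (5 repetitions at each of 2 facing directions), 8 to 12 runs yield 80 to 120 trials, and hence the 16 to 24 five-trial averages mentioned in the text.

```python
import numpy as np

def make_pseudo_trials(trials, group_size=5, rng=None):
    """Average randomly drawn, disjoint groups of trials.

    trials: (n_trials, n_sensors, n_times) single-trial data for one condition
    Returns (n_trials // group_size, n_sensors, n_times).
    """
    rng = np.random.default_rng(rng)
    n_groups = len(trials) // group_size
    order = rng.permutation(len(trials))[: n_groups * group_size]
    groups = order.reshape(n_groups, group_size)
    return trials[groups].mean(axis=1)

def sliding_mean(data, width=40):
    """Boxcar average along the time axis (1 sample = 1 msec at 1000 Hz)."""
    kernel = np.ones(width) / width
    smooth = lambda x: np.convolve(x, kernel, mode="same")
    return np.apply_along_axis(smooth, -1, data)

# Synthetic example: 80 trials (8 runs x 10 trials) of unit-variance noise.
rng = np.random.default_rng(2)
trials = rng.standard_normal((80, 360, 100))
pseudo = make_pseudo_trials(trials, group_size=5, rng=3)
smoothed = sliding_mean(pseudo, width=40)
```

Averaging five trials reduces the noise standard deviation by a factor of about the square root of five, and the temporal window reduces it further at the cost of temporal precision.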
Chance level (.52) was determined via permutation tests for the data (i.e., by running 1000 SVMs with shuffled labels), and accuracies at each time point were deemed to be above chance level if they were statistically different from this baseline as determined by a t-test versus baseline (and equivalently, if their lower 97.5% confidence bound was above this value).
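The logic of an empirically determined chance level can be illustrated with a label-shuffling sketch. To keep the example free of external dependencies, a nearest-centroid classifier is used here as a stand-in for the linear SVM, and the data are synthetic; both are assumptions of the sketch and not the procedure that produced the .52 value.

```python
import numpy as np

def loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    This is a simple stand-in for the linear SVM described in the text.
    X: (n_samples, n_features), y: array of 0/1 labels
    """
    n = len(y)
    correct = 0
    for i in range(n):
        train = np.arange(n) != i
        c0 = X[train & (y == 0)].mean(axis=0)
        c1 = X[train & (y == 1)].mean(axis=0)
        pred = int(np.linalg.norm(X[i] - c1) < np.linalg.norm(X[i] - c0))
        correct += int(pred == y[i])
    return correct / n

def permutation_null(X, y, n_perm=1000, rng=None):
    """Accuracies obtained after shuffling the condition labels."""
    rng = np.random.default_rng(rng)
    return np.array([loo_accuracy(X, rng.permutation(y)) for _ in range(n_perm)])

# Synthetic two-condition data: 20 pattern instances each, 30 features.
rng = np.random.default_rng(4)
y = np.repeat([0, 1], 20)
X = rng.standard_normal((40, 30)) + y[:, None] * 1.0   # condition 1 is shifted
observed = loo_accuracy(X, y)
null = permutation_null(X, y, n_perm=200, rng=5)       # fewer permutations for speed
chance = null.mean()
```

The mean of the shuffled-label accuracies serves as the empirical chance level, and the observed accuracy can then be compared against the upper tail of the same distribution.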

MEG-fMRI correlation
Lastly, we sought to establish correspondence between the MEG responses and previously acquired fMRI data, exploiting the temporal resolution of the MEG data in order to ask whether we can better understand the temporal order of engagement of spatially-resolved ROIs obtained previously using fMRI. To facilitate comparison with the MEG data, we newly generated fMRI decoding (MVPA)-based similarity matrices for select retinotopic and extrastriate ROIs delineated (at the individual observer level) in our previous fMRI experiment (Chang et al., 2018): V1–V3, hMT+, EBA, FBA, IFG, STS. Within each ROI, group-averaged decoding accuracies for equivalent pairwise comparisons of the six walker conditions (global, local natural, local modified walker at the two orientations) were first z-scored, and then assigned to a 6 × 6 (condition × condition) matrix. We then took the MEG decoding data at each time point and similarly assigned them to a 6 × 6 matrix, with rows and columns indexed by stimulus conditions. MEG decoding data were similarly z-scored. Finally, as the two groups of observers were different, we performed bootstrap resampling (1000 iterations) for both the MEG and fMRI data sets to generate measures of spread. Using these data, for each time point, we correlated (Spearman's rank correlation) the MEG decoding matrix to the fMRI decoding matrix of each ROI within a select window defined by the point of response onset, and return to asymptotic baseline, as determined through the univariate MEG data (200–1200 msec). This also ensured that any artificial correlations resulting from chance-level representations from the two data sets (as would be expected before response onset) can be omitted. This would be the case during a zero (MEG) response producing subsequent chance-level SVMs (being entered into the similarity matrix) when paired with non-discriminative fMRI ROIs.
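The matrix construction and rank correlation can be sketched as follows, assuming pairwise decoding accuracies are stored in a dictionary keyed by condition pairs. The helper names and the random accuracies are hypothetical. Only the 15 off-diagonal pairs enter the correlation, and because Spearman's correlation depends only on ranks, it is unaffected by the z-scoring step.

```python
import numpy as np
from itertools import combinations

def pairwise_to_matrix(acc, n_cond=6):
    """Place pairwise decoding accuracies into a symmetric condition matrix.

    acc: dict mapping (i, j) with i < j to a decoding accuracy
    """
    m = np.zeros((n_cond, n_cond))
    for (i, j), a in acc.items():
        m[i, j] = m[j, i] = a
    return m

def rankdata(x):
    """Ranks starting at 1, with tied values given their average rank."""
    x = np.asarray(x, dtype=float)
    ranks = np.empty(len(x))
    ranks[np.argsort(x, kind="mergesort")] = np.arange(1, len(x) + 1)
    for v in np.unique(x):
        tied = x == v
        ranks[tied] = ranks[tied].mean()
    return ranks

def spearman_upper(m1, m2):
    """Spearman correlation between the upper triangles of two matrices."""
    iu = np.triu_indices_from(m1, k=1)
    return np.corrcoef(rankdata(m1[iu]), rankdata(m2[iu]))[0, 1]

# Hypothetical accuracies for the 15 condition pairs at one MEG time point.
rng = np.random.default_rng(6)
pairs = list(combinations(range(6), 2))
meg = pairwise_to_matrix({p: rng.uniform(0.5, 0.9) for p in pairs})
fmri = pairwise_to_matrix({p: rng.uniform(0.5, 0.9) for p in pairs})
rho = spearman_upper(meg, fmri)
```

Repeating the last line for every time point in the analysis window and for every fMRI ROI yields one correlation time course per ROI.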
Lastly, we deemed the correlation at each time point to be statistically significant if it differed from zero, as assessed by testing whether its 99% confidence bounds overlapped with zero.
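A percentile bootstrap is one common way to obtain such confidence bounds. The sketch below resamples a vector of values with replacement and tests whether the resulting 99% interval excludes zero; the function names, the choice of the mean as the statistic, and the example values are assumptions, and the original analysis resampled the MEG and fMRI data sets themselves.

```python
import numpy as np

def bootstrap_ci(values, stat=np.mean, n_boot=1000, level=0.99, rng=None):
    """Percentile bootstrap confidence interval for a statistic."""
    rng = np.random.default_rng(rng)
    values = np.asarray(values, dtype=float)
    n = len(values)
    # Resample with replacement and recompute the statistic each time.
    boots = np.array([stat(values[rng.integers(0, n, n)]) for _ in range(n_boot)])
    alpha = (1.0 - level) / 2.0
    lo, hi = np.quantile(boots, [alpha, 1.0 - alpha])
    return lo, hi

def excludes_zero(lo, hi):
    """True when the whole interval lies on one side of zero."""
    return lo > 0 or hi < 0

# Hypothetical values: one clearly positive set, one symmetric about zero.
rng = np.random.default_rng(7)
positive = rng.normal(0.3, 0.1, 21)
balanced = np.linspace(-1.0, 1.0, 21)
ci_positive = bootstrap_ci(positive, rng=8)
ci_balanced = bootstrap_ci(balanced, rng=9)
```

In this example the interval for `positive` lies entirely above zero, whereas the interval for `balanced` straddles zero and would not be deemed significant.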

Behaviour
We first examined the behavioural responses to the stimuli in order to validate that the MEG presentation protocol was well able to capture clear stimulus-related behavioural effects that were evident under our previous fMRI protocol (Chang et al., 2018). Mean behavioural performance accuracies quantified in terms of the proportion of correct left-right discrimination responses across conditions are presented in Fig. 1e. Performance was highest for the global stimulus as compared to the local natural and modified stimuli, for which performance did not differ. Additionally, discrimination performance declined when stimuli were inverted. These observations are confirmed by a 2 (orientation) × 3 (walker type) repeated-measures ANOVA that indicated significant main effects of orientation We note that the behavioural results observed here are consistent with those observed in our previous work using identical stimuli (Chang et al., 2018).

MEG source analysis
We examined neuromagnetic responses in terms of their univariate (dSPM) estimations, in order to better align with previous work (Pavlova et al., 2005). dSPM amplitudes (normalized to the −200 to 0 msec period) for the individual regions (Fig. 2a, inset) are presented separately for the three main walker types and two orientations in Fig. 2a. Note that in order to retrieve the dSPM response for a given ROI, we averaged nodes where applicable. As evident in this figure, response onset differs according to ROI, with the earliest peak response observed at ~160 msec post-stimulus onset in lateral-occipital cortex and at 200–250 msec in the remaining ROIs. Of particular note is that many regions appear to exhibit two characteristic peaks during the event period: an early peak between 0 and 300 msec post-stimulus onset, and a later peak that can be observed beyond 300 msec. After the second peak, responses across all ROIs decline and approach asymptote by 1200 msec post-stimulus onset.
In order to capture the characteristics of both response periods, we retrieved dSPM amplitudes separately for the "early" and "late" periods, defined as the window between 0 and 300 msec for the early response, and the window between 301 and 1500 msec post-stimulus onset for the late response. Within each window, we identified the latency to the early and late peak responses independently for each condition and within each ROI. We then extracted amplitudes at these corresponding latencies. A "peak" response then, corresponds to the maximum amplitude for a particular condition, and for a particular ROI, within the specified window. The mean latencies and the peak amplitudes extracted in this manner are presented in Fig. 2b and c for the early and late responses, respectively. These latencies and amplitudes are also visualized in Fig. 3 and summarized in Supplementary  Table S2.
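The windowed peak extraction described above can be sketched as follows. This is a minimal illustration on a synthetic time course; the function name, the 1-msec sampling grid, and the two Gaussian bumps are assumptions made for the example and are not the recorded data.

```python
import numpy as np

def peak_in_window(times_ms, amplitude, t_start, t_end):
    """Latency (msec) and amplitude of the maximum within [t_start, t_end]."""
    idx = np.flatnonzero((times_ms >= t_start) & (times_ms <= t_end))
    best = idx[np.argmax(amplitude[idx])]
    return float(times_ms[best]), float(amplitude[best])

# Hypothetical time course: two bumps centred at 200 and 650 msec.
times_ms = np.arange(-200, 1501)  # assumed 1-msec steps over the event period
amplitude = (1.0 * np.exp(-0.5 * ((times_ms - 200) / 40.0) ** 2)
             + 1.5 * np.exp(-0.5 * ((times_ms - 650) / 120.0) ** 2))

# Windows follow the definitions in the text: 0-300 msec and 301-1500 msec.
early = peak_in_window(times_ms, amplitude, 0, 300)
late = peak_in_window(times_ms, amplitude, 301, 1500)
print("early peak (latency, amplitude):", early)
print("late peak (latency, amplitude):", late)
```

In practice the function would be applied once per condition and per ROI, yielding the latency and amplitude values that enter the ANOVAs.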
Fig. 2. Electromagnetic responses presented in terms of their univariate (dSPM) estimations. (a) Normalized dSPM amplitudes within the event period of −200 msec to 1500 msec are presented separately for the three main walker types and two orientations. The shaded area denotes stimulus onset and offset. Sample early and late peak positions for the "global" upright and inverted conditions are marked using black (early) and grey (late) arrows. Amplitudes were extracted within ROIs centered on centers-of-mass of individual anatomically parcellated regions (inset, sample ROIs for an observer-averaged brain). In (b) and (c), dSPM peak amplitudes are presented separately for the "early" (0–300 msec) and "late" (301–1500 msec) periods, respectively. Error bars represent ± standard error of the mean.
Latencies were entered into a full 2 (period: early/late) × 8 (ROI) × 3 (walker type) × 2 (orientation) ANOVA that indicated significant main effects of ROI [F(7, 140) = 8. Overall, "late" activity of superior-parietal (~520 msec), inferior-parietal (~580 msec), and lateral-occipital (~590 msec) cortex was significantly earlier than that in middle-temporal (640 msec) and superior-temporal (670 msec) cortex, which in turn preceded activity in caudal middle-frontal (770 msec) and lateral orbito-frontal (820 msec) cortex. In Supplementary Figure S2, we present responses for one particular condition (global, upright), overlaying all eight ROIs, in order to better visualize the order of engagement of these regions during the early and late periods. As evident in this figure, during both periods, middle-temporal and inferior-temporal cortex are engaged following lateral-occipital and parietal cortex. Importantly, for these late responses, only those in superior-temporal cortex differed across the three walker types. Specifically, for this region, latencies for the global walker (481 msec) were significantly earlier than the late latencies of the local natural (783 msec) and local modified stimuli (738 msec) [F(2, 40) = 12.5, p < .001, ηp² = .38]. The three-way period × walker type × orientation interaction was similarly followed up by separate two-way ANOVAs for each of the early and late periods. The analysis for the early period indicated that overall latencies (across all walker types) were later for the inverted rather than upright orientation [F(1, 20) = 9.1, p = .007, ηp² = .31]. By contrast, the analysis for the late period indicated that latencies were later for the inverted rather than upright orientation, but only for the global walker [walker × orientation interaction; F(2, 40) = 7.0, p = .002, ηp² = .26]. Finally, the ROI × walker type × orientation interaction was followed up by separate, corrected ANOVAs for each ROI (collapsed across periods). The analysis yielded effects in:

Middle-temporal cortex
Here, latencies were earlier for the inverted versus upright orientation for the global (335 vs 469 msec) and local natural (433 vs 495 msec) walkers, but later for the inverted versus upright stimulus for the local modified walker (466 vs 402 msec) [walker × orientation interaction; F(2, 40) = 5.7,

Lateral orbito-frontal cortex
Here, latencies were earlier for the inverted (423 msec) than upright (561 msec) orientation for the global walker, but later for the inverted versus upright walker for the local natural (555 vs 484 msec) and local modified (550 vs 508 msec) stimuli [walker × orientation interaction; F(2, 40) = 8.9, p = .001,
A summary of these main effects, interactions, and outcomes of the follow-up comparisons is presented in Supplementary Table S3.
Amplitudes were next entered into a comparable 2 (period: early/late) × 8 (ROI) × 3 (walker type) × 2 (orientation) ANOVA that indicated significant main effects of period [F(1, 20)  The period by walker type interaction was followed up by separate, corrected, one-way ANOVAs comparing the three walker types separately for the early and late periods. The ANOVA for the early period indicated that amplitudes for all three walker types did not differ [F(2, 40) = .24, p = .79, ηp² = .012]. By contrast, for the late period, amplitudes for the global walker were significantly higher than those for the two local walkers [F(2, 40) = 3.35, p = .04, ηp² = .14]. The walker type by orientation interaction was due to the fact that amplitudes were higher for the inverted than for the upright orientation, but only for the global walker [Bonferroni-adjusted t-tests, t(20) = −2.7, p = .015 for the global walker].
Finally, the period by ROI by orientation three-way interaction was further probed by separate two-way ANOVAs comparing amplitudes across early and late periods at the two orientations for each ROI. The analysis revealed that amplitudes of the inferior-parietal cortex were higher for the inverted orientation (while collapsed across walker types), but only during the late period [F(1, 20)  , were higher in the late versus early periods, though did not differ between the two orientations. A summary of these main effects and interactions is also presented in Supplementary Table S3.
We repeated these analyses including hemisphere as a factor to test for any laterality effects in our data. As there were no effects of hemisphere nor interactions between hemisphere and stimulus-specific responses (Supplementary Figures S3 and S4), we retained bilateral treatment of the data in all subsequent analyses.
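For readers wishing to relate the F statistics and effect sizes reported in this section, partial eta squared can be recovered from an F value and its degrees of freedom through the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies this identity to values quoted in the text; the function name is ours and is introduced only for illustration.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# (F, df_effect, df_error, reported effect size), as quoted in the text above.
reported = [
    (12.5, 2, 40, 0.38),  # superior-temporal late latencies across walker types
    (9.1, 1, 20, 0.31),   # early-period orientation effect on latency
    (7.0, 2, 40, 0.26),   # late-period walker x orientation interaction
    (3.35, 2, 40, 0.14),  # late-period amplitude differences across walkers
]
for f_value, df_effect, df_error, eta_reported in reported:
    eta = partial_eta_squared(f_value, df_effect, df_error)
    print(f"F({df_effect}, {df_error}) = {f_value}: computed {eta:.3f}, reported {eta_reported}")
```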

MEG sensor-level decoding
Next, we tested the extent to which MEG responses could distinguish between stimulus conditions in terms of their response patterns (or vectors). As noted, following Cichy et al. (2016), we retained the data in sensor space (i.e., free of forward and inverse model assumptions) and treated all 360 sensors as a single "ROI", in order to solve the problem of having insufficient nodes defining patterns within individual regions of interest. This ensured that meaningful patterned responses could be extracted. Further, retaining the data in this manner facilitates comparisons between MEG and fMRI decoding-based similarity matrices in subsequent analyses. Results from the 360-channel based SVM analysis comparing differences in pattern responses across the different stimulus conditions are presented in Fig. 4. In this figure, mean decoding accuracies and their corresponding upper and lower 97.5% bounds can be referenced with respect to the shuffled baseline (52%, dashed line). As noted, accuracies at each time point were deemed above chance level if their lower 97.5% confidence bound was above this value. We first inspected the prediction accuracies of sensor patterns for all possible pairwise combinations of the upright stimuli (Fig. 4a) and observed a clear onset of pattern discriminability between the global form and kinematics (local natural) stimuli, as well as between the global form and modified stimuli, beginning at 300 msec post-stimulus onset, peaking at 489–544 msec (64%, CI95 = 60–68%; and 63%, CI95 = 59–67% for the global vs local natural and global vs local modified comparisons, respectively), before returning to baseline at 1250 msec. By contrast, MEG sensor patterns between the local natural and local modified stimuli were indistinguishable. We then compared the temporal profile of sensor pattern discriminations for the inverted stimuli (Fig. 4b). Here, we observed two distinct peaks that are evident for both the global versus local natural, and global versus local modified comparisons: a first, smaller one at 100 msec (60%, CI95 = 56–64%; and 58%, CI95 = 54–62% for the global vs local natural and global vs local modified comparisons, respectively), after which it returns to baseline, followed by a second larger and more sustained peak at 270–285 msec post-stimulus onset (65%, CI95 = 60–70%; and 63%, CI95 = 58–68% for the global vs local natural and global vs local modified comparisons, respectively). SVM accuracies return to baseline again at 1250 msec after stimulus onset. As for the upright comparisons, sensor response patterns were indistinguishable between the local natural and local modified stimuli. Finally, results from SVM classifications of sensor response patterns for upright and inverted variations of the global stimulus reveal onset of discriminability at 100 msec (59%, CI95 = 54–64%), peaking at 340 msec (65%, CI95 = 60–70%) before again returning to baseline by 1250 msec. Sensor response patterns for upright and inverted variations of the local natural and local modified stimuli remained indistinguishable over the entire event period.
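To make the logic of time-resolved pairwise decoding concrete, the sketch below classifies two conditions from sensor patterns independently at each time point, using cross-validation. It is an illustration only: a nearest-class-mean rule is swapped in as a lightweight stand-in for the SVM used in the study, and the array shapes, fold count, and synthetic data are all assumptions.

```python
import numpy as np

def pairwise_decoding(x_a, x_b, n_folds=5):
    """Cross-validated accuracy per time point for two conditions.

    x_a, x_b: arrays of shape (trials, sensors, times) with equal trial counts.
    A nearest-class-mean rule stands in for the SVM (assumption).
    """
    n_trials, _, n_times = x_a.shape
    correct = np.zeros(n_times)
    for test_idx in np.array_split(np.arange(n_trials), n_folds):
        train_idx = np.setdiff1d(np.arange(n_trials), test_idx)
        mean_a = x_a[train_idx].mean(axis=0)  # (sensors, times)
        mean_b = x_b[train_idx].mean(axis=0)
        for x_test, is_a in ((x_a[test_idx], True), (x_b[test_idx], False)):
            # Squared distance of each held-out pattern to each class mean.
            d_a = ((x_test - mean_a) ** 2).sum(axis=1)  # (test trials, times)
            d_b = ((x_test - mean_b) ** 2).sum(axis=1)
            correct += ((d_a < d_b) == is_a).sum(axis=0)
    return correct / (2 * n_trials)

# Hypothetical data: 40 trials, 20 sensors, 10 time points per condition;
# a condition difference is injected from the sixth time point onwards.
rng = np.random.default_rng(0)
x_a = rng.normal(size=(40, 20, 10))
x_b = rng.normal(size=(40, 20, 10))
x_a[:, :, 5:] += 1.0

accuracy = pairwise_decoding(x_a, x_b)
print(np.round(accuracy, 2))
```

Accuracies obtained this way would then be compared against a shuffled-label baseline, with a time point counted as above chance only when its lower confidence bound exceeds that baseline.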

MEG-fMRI correlation
Finally, in order to better understand the cortical sources of the MEG signals, we performed a representational similarity analysis (Edelman, 1998; Kriegeskorte et al., 2008) in which we asked whether MEG representations could be mapped on to representations extracted with fMRI in representative retinotopic (V1–V3) and extrastriate (hMT+, STS, EBA, FBA, IFG) regions deemed relevant in previous work (Chang et al., 2018).
To do so, we assigned results from the MEG SVM (condition) decoding analysis into a 6 (condition) × 6 (condition) matrix. Drawing upon the fMRI work reported previously using identical stimuli (Chang et al., 2018), we also assigned fMRI decoding accuracies for the select regions of interest into a 6 × 6 matrix. For each time point, we then used Spearman's rank correlation to correlate the MEG decoding matrix to the fMRI matrix (Fig. 5a). We ran correlations (one for each of the eight fMRI ROIs) within a select window of 200–1200 msec, as this window encapsulates the point of univariate and multivariate MEG response onset, and the response's return to asymptotic baseline (Figs. 2 and 4). This was also done in order to avoid artificial correlations between the two data sets that result from a zero response, which would then lead to chance-level SVMs. This would be the case, for example, before the onset of the MEG response, but also in non-discriminative fMRI ROIs. This problem is particularly amplified by the small stimulus set used to define the similarity matrix. The results (median at each time point) from this analysis are presented in Fig. 5b and c for the early and extrastriate ROIs, respectively. Note that here, in order to help account for the multiple correlations run across the 8 ROIs, we deemed correlations to be significant at time points whose upper and lower 99% confidence bounds do not overlap with zero. From this figure, it is evident that MEG sensor signals can be mapped to fMRI representations at two distinct intervals: for V1 and V2, there is an onset of representational similarity between MEG and fMRI patterns by 545–555 msec (V1, rs = .60; V2, rs = .64), persisting until about 900 msec. By contrast, for extrastriate ROIs, and particularly EBA and FBA, onset of representational similarity arrives much later, between 1083 and 1125 msec (EBA, rs = .38; FBA, rs = .50).
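The matrix-to-matrix correlation described above can be sketched as follows. The sketch assumes that only the 15 unique off-diagonal condition pairs of each symmetric 6 × 6 matrix enter the correlation, which is an illustrative choice rather than a detail stated in the text. Spearman's coefficient is computed directly with NumPy as the Pearson correlation of average ranks; all names and the toy matrices are hypothetical.

```python
import numpy as np

def average_ranks(values):
    """Ranks starting at 1; tied values share their mean rank."""
    values = np.asarray(values, dtype=float)
    ranks = np.empty(len(values))
    ranks[np.argsort(values, kind="mergesort")] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])

def rsa_timecourse(meg_matrices, fmri_matrix):
    """Correlate each MEG decoding matrix (times, n, n) with one fMRI matrix (n, n)."""
    upper = np.triu_indices(fmri_matrix.shape[0], k=1)  # unique condition pairs
    fmri_vec = fmri_matrix[upper]
    return np.array([spearman(m[upper], fmri_vec) for m in meg_matrices])

# Hypothetical fMRI decoding accuracies for the 15 condition pairs.
rng = np.random.default_rng(1)
fmri = np.zeros((6, 6))
fmri[np.triu_indices(6, k=1)] = rng.uniform(0.5, 0.7, size=15)
fmri = fmri + fmri.T

# Two toy MEG "time points": same rank order as fMRI, then reversed order.
meg = np.stack([fmri ** 2, 1.0 - fmri])
rho = rsa_timecourse(meg, fmri)
print(np.round(rho, 3))
```

Running `rsa_timecourse` once per fMRI ROI yields one correlation time course per region, to which the confidence-bound criterion is then applied.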

Discussion
We examined the temporal characteristics of neuromagnetic responses to biological motion stimuli that were designed to carefully isolate mechanisms related to the processing of global form and local kinematics. As in our previous work using identical stimuli (Chang et al., 2018), we found that behavioural discrimination performance was best for the global stimulus as compared to the local natural and modified stimuli. Performance worsened significantly when the local stimuli, in particular, were turned upside-down. As we have previously discussed very similar behavioural data obtained with our fMRI work (Chang et al., 2018), including potential mechanisms underlying observers' ability to discriminate direction from the putatively uninformative local modified stimulus, we will not engage in further discussion about them here.
Here, we were interested in achieving a better understanding of the temporal characteristics of responses to biological motion. MEG responses were examined by means of conventional source estimation (dSPM), multivariate pattern analysis, and finally, in terms of their representational correspondence to fMRI responses acquired previously.

dSPM and the temporal characteristics of biological motion responses
We firstly noted two peaks in the univariate dSPM amplitudes, with the first occurring ~200 msec post-stimulus onset, and a second peak between 500 and 800 msec post-stimulus onset. A number of studies have shown two response components for intact and scrambled biological motion stimuli (e.g., Hirai et al., 2003, 2005; Pavlova et al., 2005), although it has been unclear from these previous studies whether those components were responding to the morphological shape of the stimulus or to kinematics information, as such information has regularly been conflated in the intact stimulus. Our univariate data, although somewhat different from this previous work in that our late component is significantly later than that previously reported, suggest that form-only stimuli (i.e., stimuli that are devoid of kinematics information) are sufficient to elicit both of these component responses. Critically, a comparison of dSPM amplitudes estimated for our regions of interest covering well-reported areas for biological motion (lateral-occipital, middle-temporal, superior-temporal, inferior-parietal, superior-parietal, inferior-temporal, lateral orbito-frontal, and caudal middle-frontal cortex) allowed us to reveal the finer order of engagement of each of these areas. In particular, during both response periods, responses in early (lateral-occipital) and parietal cortex preceded those in temporal cortex, which in turn preceded those in middle-frontal and orbito-frontal cortex. Response latencies varied to some degree with stimulus configuration, particularly in the oft-implicated superior-temporal cortex by the late period. Specifically, responses within the late period were significantly earlier for the global stimulus than for the two local stimuli. Latencies in this region were also orientation-dependent, such that they were earlier for the inverted than upright stimulus, and also earlier for the global than the two local stimuli.
In terms of dSPM amplitudes, responses for global stimuli (but not the two local stimuli) were notably more pronounced for inverted than upright conditions. Crucially, we found that amplitudes for the global walker were significantly higher than those for the local natural and modified walkers, but only in the late period. These findings of stimulus selectivity emerging in the later period, at first glance, appear to suggest that it is the second component that reflects the recognition of biological form. This conclusion would align with previous suggestions that the initial response component reflects lower-order motion processing, while the second response component may reflect higher-order perception of biological motion, although, again, we note that while our early latency is in line with the first component reported in that work, our late component is still significantly later than the second component (~400 msec) reported there. Nonetheless, our MVPA results give pause to this apparent distinction between the functional relevance of the early versus late responses.

4.2. Multivariate responses – the significance of the early response
While the MVPA results for the upright stimuli (Fig. 4a) also seem to point to the significance of a later (post-300 msec) response period, indicating that the form-only stimulus could be reliably discriminated from the kinematics (local natural) and modified stimuli beginning at 300 msec post-stimulus onset (i.e., during the late period), SVM comparisons of the inverted stimuli, as well as comparisons between the orientations within each class of stimuli, indicate two distinct peaks of pattern discriminability. The first occurs at 100 msec post-stimulus onset for the global versus local natural (and local modified) stimuli, after which accuracies return to baseline. This is followed by a second, sustained peak at around 300 msec post-stimulus (Fig. 4b). SVM classifications comparing response patterns for the two orientations (upright and inverted) within each class of stimuli similarly show an early onset of pattern discriminability, peaking firstly at 100 msec before returning shortly to baseline, and a second onset of pattern discriminability, peaking at 300 msec (Fig. 4c).
Fig. 5. Representational-similarity-based correlation between MEG and fMRI responses. (a) We assigned both MEG and fMRI results from the SVM (condition) decoding analysis into a 6 (condition) × 6 (condition) matrix. For each time point, we then used Spearman's rank correlation to correlate the MEG decoding matrix to the fMRI matrix. Correlations were run within a select portion of the event window defined by the point of response onset, and return to asymptotic baseline, as determined through the univariate and multivariate MEG data (200–1200 msec), in order to avoid artificial correlations of the two matrices before response onset due to chance representations. Median correlations are presented along with upper and lower 99% confidence bounds (shaded regions) for early (b), and extrastriate (c) ROIs.
These results are revealing in two ways. Firstly, in all cases, classifications of response patterns between the local natural and modified stimuli, as well as between responses for the upright and inverted variations of each of the local natural and modified stimuli, were indistinguishable from chance across the entire event period. This suggests that neuromagnetic patterned responses measured here better reflect the presence or absence of global form in the stimulus. The question that follows, then, is what underlies human sensitivity to local kinematics. Notably, using fMRI, both we (Chang et al., 2018) and others (Jastorff & Orban, 2009) have shown cortical sensitivity to local kinematics in areas such as hMT+, pSTS, and even as deep as the motor thalamus. While we cannot attain reliable measures of deeper (subcortical) responses with MEG, it is also possible that cortical response patterns specific to biological kinematics are more focal, and thus unable to be adequately reflected in the sparse and superficial sensor-sampling of MEG. Alternatively, it is possible that the sources of response patterns specific to biological kinematics are simply more radial (e.g., top of cortical gyri), rather than tangential (e.g., sulci), which would render them more difficult to measure via MEG (Cohen & Cuffin, 1983). Finally, it is of course entirely possible that due to the inherent differences in signals being measured (with fMRI measuring blood-oxygen-level-dependent signals, and MEG more closely reflecting absolute neuronal activity), one or the other is a more sensitive method for elucidating condition-specific responses.
The second insight that can be gained from our data is that biological motion perception is likely reflected by an even earlier response than that previously established in the human. Previous work has either found a lack of stimulus-specific responses in the early component, or a later emergence of biological motion-specific responses at 170–200 msec (Hirai et al., 2003; Jokisch et al., 2005). While on their own, our earlier dSPM results would suggest a lack of stimulus-specific responses in the early response component, results from our multivariate treatment of the data appear to suggest that the onset of biological motion-specific responses occurs much earlier, as soon as 100 msec post-stimulus onset, highlighting the need to move beyond simple univariate considerations of biological motion responses. This 100 msec 'latency' corresponds well to the latency of responses reported in neurons of the macaque superior temporal polysensory (STP) area upon the viewing of biological motion (Oram & Perrett, 1994). We can rule out the possibility that this early discrimination is driven by responses to low-level, biologically-irrelevant differences in the stimuli as, curiously, this early peak is only evident in comparisons of the three stimulus classes when all are presented upside-down, or in comparisons of upright and inverted variations of the global stimulus. It is not evident in comparisons of the three stimulus classes when the stimuli are all held upright. Any lower-order differences in the stimuli are of course the same regardless of whether the stimuli are held upright or inverted. The particularly pronounced peak observed with the upside-down stimuli (Fig. 4b) well matches the exaggerated differences in behavioural performance between the three classes of stimuli (Fig. 1) when stimuli are upside-down, rather than upright.
Finally, it is worth mentioning that previous work in adolescents with Periventricular Leukomalacia, a brain anomaly in individuals born prematurely, has indicated attenuations in responses over right parietal cortex as early as 140 msec, that match behavioural deficits during a biological motion one-back task (Pavlova et al., 2006).

4.3. MEG-fMRI correlations: relevance of early versus extrastriate cortex
Equipped with data acquired previously in the fMRI using identical stimuli (Chang et al., 2018), we lastly sought to probe the cortical sources underlying our multivariate (sensor-pattern) MEG signals. To do so, we set the MEG and fMRI data in a common space, defining SVM accuracies in a representational similarity matrix. The logic is that SVM representations in spatially resolved regions obtained in the fMRI should better correlate with sensor-level MEG SVM representations at various timepoints depending on their relevance (Cichy et al., 2014, 2016). Strikingly, we found a clear dissociation in terms of the onset of representational correspondence between early (V1–V2) and extrastriate (EBA, FBA) regions. In particular, for V1 and V2, representational correspondence is attained by ~550 msec, persisting until 900 msec. By contrast, in extrastriate EBA and FBA, onset of representational correspondence is only attained at roughly 1100 msec post-stimulus onset (or, 600 msec after stimulus offset), lasting only transiently (~100 msec). It is important to note that this late correspondence is still later than the 'late' response observed from the univariate responses. These findings align with our earlier findings of the persistence of multivariate responses and suggest further that higher-order body areas, following the extraction of biological morphology or features, perhaps continue to be engaged for action understanding. EBA and FBA have been previously reported to be selective for intact versus scrambled biological motion (Grossman & Blake, 2002; Jastorff & Orban, 2009; Peelen et al., 2006). EBA responses in particular have been suggested to reflect the integration of form and motion information (Jastorff & Orban, 2009), or to encode motion (along occipitotemporal neurons in hMT+) in parallel with coarsely-defined posture information (Thompson et al., 2005). By contrast, FBA is thought to encode more detailed information of body morphology and posture (Thompson & Baccus, 2012). Importantly, EBA and FBA have been more recently discussed in relation to 'context integration', and are posited to aid in action understanding through providing continual updates to a larger network that compares against expectations and semantic associations (Amoruso et al., 2011). The late engagement of these higher-order body areas observed here may relate to top-down influence from this network that attempts to tag significance to the individual events.

Summary
We carefully teased apart neuromagnetic responses to biological form and kinematics. Our data yield a number of insights into the spatiotemporal characteristics of biological motion processing. We found firstly two univariate response components (dSPM) to biological motion stimuli. Our first "early" component occurred around 200 msec post-stimulus onset, and the second "late" component occurred between 500 and 800 msec post-stimulus onset. For both components, responses proceeded in a feedforward manner, with responses in early (lateral-occipital) and parietal cortex preceding those in temporal and frontal cortex. Moreover, while univariate responses (both in terms of latencies and amplitudes) demonstrate biological motion specificity in the late period, multivariate patterns specific to form can be well discriminated from local cues as early as 100 msec after stimulus onset. This marks a substantially earlier emergence of biological motion representations in the human brain than previously reported (Hirai et al., 2003; Jokisch et al., 2005), and aligns with previous physiological work in the macaque (Oram & Perrett, 1994). Finally, by fusing MEG and previously acquired fMRI data using representational similarity correlations, we establish that MEG (pattern) signals are best reflected by early visual regions (V1, V2) early in encoding, and by extrastriate regions (EBA, FBA) later, perhaps during a period that integrates body features and morphology against expectations and semantics. Certainly, our findings leave some questions unanswered. Notably, from our data, we were unable to distinguish between responses for the local natural and modified stimuli, which signal the presence or absence of biological kinematics information, respectively. This perhaps reflects insensitivity of the MEG signals to the very subtle differences present in the otherwise very perceptually similar stimuli used here. Alternatively, it is possible that mechanisms underlying the extraction of biological kinematics may be located in even deeper regions of the brain that cannot be well accessed by MEG. Further, due to the independence of the MEG-fMRI data sets (i.e., obtained in different subjects), it is possible that we have missed still other biological motion-relevant correspondences. It is important to remember that although the various approaches probed here ask different questions about responsivity (dSPM), pattern-specificity (MVPA), and stimulus-representational correspondences (RSA), they also differ in some key ways: our MVPA treatment of the MEG data considers the whole-brain patterned responses, while the dSPM (and even the RSA-fMRI) responses are retrieved from finer ROIs. While the effects of all of these issues will require further empirical validation, our findings suggest that multivariate considerations of the data constitute a powerful approach for elucidating subtleties (i.e., even earlier and later responses) in the spatiotemporal profile of biological motion responses.