Stimulus Onset Asynchrony Affects Weighting-related Event-related Spectral Power in Self-motion Perception

Abstract
Self-motion perception relies primarily on the integration of the visual, vestibular, proprioceptive, and somatosensory systems. There is a gap in understanding how a temporal lag between visual and vestibular motion cues affects visual–vestibular weighting during self-motion perception. The beta band is an index of visual–vestibular weighting, in that robust beta event-related synchronization (ERS) is associated with visual weighting bias, and robust beta event-related desynchronization (ERD) is associated with vestibular weighting bias. The present study examined modulation of event-related spectral power during a heading judgment task in which participants attended to either visual (optic flow) or physical (inertial cues stimulating the vestibular, proprioceptive and somatosensory systems) motion cues from a motion simulator mounted on a MOOG Stewart Platform. The temporal lag between the onset of visual and physical motion cues was manipulated to produce three lag conditions: simultaneous onset, visual before physical motion onset, and physical before visual motion onset. There were two main findings. First, we demonstrated that when the attended motion cue was presented before an ignored cue, the power of beta associated with the attended modality was greater than when visual–vestibular cues were presented simultaneously or when the ignored cue was presented first. This was the case for beta ERS when the visual-motion cue was attended to, and beta ERD when the physical-motion cue was attended to. Second, we tested whether the power of feature-binding gamma ERS (demonstrated in audiovisual and visual–tactile integration studies) increased when the visual–vestibular cues were presented simultaneously versus with temporal asynchrony. We did not observe an increase in gamma ERS when cues were presented simultaneously, suggesting that electrophysiological markers of visual–vestibular binding differ from markers of audiovisual and visual–tactile integration. All event-related spectral power reported in this study was generated from dipoles projecting from the left and right motor areas, based on the results of Measure Projection Analysis.


INTRODUCTION
The visual, vestibular, proprioceptive, and somatosensory systems collect information about how an organism moves through its environment, and integrate this information in associated brain areas, such as medial superior temporal area and ventral intraparietal area (for a review, see DeAngelis & Angelaki, 2012), to produce a smooth, unified perception of self-motion. One complicating factor in this integration process is that each of these cues to motion is perceived on different timelines. For example, self-motion information from the visual system is perceived faster than self-motion information from the vestibular system (e.g., RTs are ∼220 msec for light and ∼440 msec for galvanic vestibular stimulation; Barnett-Cowan & Harris, 2009); however, our perception of self-motion is a function of multisensory integration. Understanding how the temporal factors of visual and vestibular perception affect multisensory integration has been of interest to researchers in many fields of science and engineering. For example, understanding this construct has been a major focus for transfer of training research and for setting policies by flight training administration authorities.
Given the different temporal trajectories of information processing between sensory systems, the temporal integration of multisensory stimuli has long been of interest to researchers. For example, in audiovisual integration, direction-incongruent stimuli give rise to the ventriloquist effect, in which the two stimuli are perceived as having the same source despite a spatially separated origin (Alais & Burr, 2004). This effect disappears when the asynchrony of the audiovisual stimuli exceeds ∼300 msec (Slutsky & Recanzone, 2001). We still do not fully understand the potential effect of temporal asynchrony on visual-vestibular integration and self-motion perception, especially in the context of driving and flight motion-simulator research. However, a recent study demonstrated that changes in the velocity of a visual or physical self-motion cue are most quickly detected when the stimuli are aligned, compared with a 100-msec timing difference (Kenney et al., 2020). Moreover, Rodriguez and Crane (2021) demonstrated that visual-inertial (e.g., visual-vestibular) heading perception is also sensitive to temporal misalignments of less than 250 msec between the motion cues.
Multisensory integration is also affected by attention allocation (Macaluso et al., 2016). Attention can be voluntarily allocated toward a stimulus, a sensory modality, or a specific region of space to achieve task goals (Li, Piëch, & Gilbert, 2004). However, processing can also be involuntarily captured by sensory events, even when the attention-capturing signals are unrelated to the current goal-directed activity (Öhman, Flykt, & Esteves, 2001). EEG is a useful tool to explore the online processes related to the interaction between attention and multisensory integration. The high temporal resolution of EEG has been effective in testing hypotheses related to synchronization of neural oscillations as a mechanism for the integration of information across sensory modalities (Senkowski, Schneider, Foxe, & Engel, 2008). Synchronization of neural oscillations (event-related spectral power [ERSP]) is quantified by measuring power of event-related synchronizations (ERSs) and desynchronizations (ERDs) within particular frequency bands (e.g., theta, alpha, beta, gamma). One hypothesis about interpretation of neural oscillations is that distinct spectral timelines index different local cortical networks involved in sensory processing, attention allocation, and multisensory integration (Siegel, Donner, & Engel, 2012). Most studies that support the spectral timelines hypothesis are based on audiovisual or visuotactile integration (for a review, see Keil & Senkowski, 2018). For example, Senkowski, Talsma, Grigutsch, Herrmann, and Woldorff (2007) showed that the closer in time the audiovisual stimuli were presented together, the more feature binding-related gamma ERS was elicited early after stimulus onset. This finding also supports Singer and Gray's (1995) temporal correlation hypothesis, which suggests that oscillations within the gamma band facilitate integration across sensory modalities. As far as we know, there are few published studies exploring how the onset timing of multisensory stimuli affects EEG correlates of visual-vestibular integration.
Townsend, Legere, O'Malley, von Mohrenschildt, and Shedden (2019) used a high-fidelity motion simulator and a high-density EEG array to observe ERSP in response to simultaneous-onset visual- and physical-motion stimuli. To examine the effect of attention allocation to visual versus physical motion, in a blocked design, participants made heading judgments to visual (or physical) cues only, while ignoring the other modality. For each trial, headings of the motion cues were either spatially congruent (e.g., heading was the same for visual and physical) or incongruent (e.g., visual and physical headings differed). Importantly, in all conditions, the visual and physical cues to self-motion were presented simultaneously. Measure Projection Analysis (MPA) identified cortical regions in the premotor and sensory motor areas (Brodmann's areas [BAs] 6 and 4) associated with motor processing. ERSP analysis within these areas revealed sensitivity of theta- (4-7 Hz), alpha- (8-12 Hz), and beta- (13-30 Hz) band oscillations to attended visual versus physical self-motion stimuli. Specifically, attending to the visual-motion stimulus (while ignoring the physical-motion stimulus) evoked earlier theta ERS and alpha ERD, whereas attention to the physical-motion stimulus (while ignoring the visual-motion stimulus) evoked longer-lasting and more powerful beta ERD. Complementary research suggests that theta ERS is an index of heading processing (Townsend, Legere, von Mohrenschildt, & Shedden, 2022; for a review, see Buzsáki & Moser, 2013), and alpha ERD/ERS is associated with focal attention and cognitive load (for a review, see Klimesch, 2012). Most important for the present article, previous research has indicated that beta ERD/ERS indexes visual-vestibular weighting (Townsend et al., 2019, 2022). For example, when attention was focused on the visual-motion stimulus (while ignoring physical-motion cues), beta ERS was stronger, whereas when attention was focused on the physical-motion stimulus (while ignoring visual-motion cues), beta ERD was stronger (Townsend et al., 2019). The purpose of the present article was to further examine visual-vestibular weighting by manipulating the timing of onset of the self-motion cues.
Previous research has demonstrated that the beta band is an index of visual-vestibular weighting, and that attention allocation plays a key role in how weighting is distributed among multisensory inputs (Townsend et al., 2019, 2022). Those studies, however, did not investigate the impact that stimulus onset timing has on the process of visual-vestibular weighting within self-motion perception. Previous research has shown that discrepancies in the onset timing of audiovisual stimuli can affect multisensory weighting (Fister, Stevenson, Nidiffer, Barnett, & Wallace, 2016; Sheppard, Raposo, & Churchland, 2013). We need a better understanding of how the interaction of attention allocation and temporal misalignment affects the underlying cortical activity associated with visual-vestibular integration during self-motion perception. The goals of the present study were twofold. The first goal was to examine the effect of attention allocation and temporal asynchrony on induced ERSP, specifically the power and time course of beta oscillations associated with visual-vestibular weighting. The second goal was to examine induced gamma oscillations. Previous multisensory research (e.g., Senkowski et al., 2007) demonstrated more powerful feature-binding gamma ERS when audiovisual multisensory cue onsets were presented closer in time. The present study extends this work by asking whether feature binding reflected by gamma ERS is similar for visual-vestibular integration.
Participants attended to either physical (ignoring visual) or visual (ignoring physical) motion cues (blocked design) and discriminated between left and right self-motion headings (random presentation within a block). There were three SOA conditions: (1) visual motion onset 100 msec before physical motion onset, (2) physical motion onset 100 msec before visual motion onset, and (3) simultaneous visual and physical motion onset. Given previous research (Townsend et al., 2019, 2022), we hypothesized that beta ERD would be most powerful when participants attended to the physical-motion cues, and beta ERS would be most powerful when participants attended to visual-motion cues. This pattern, however, would be modulated by the temporal lag conditions, such that beta ERD in response to attention to physical motion would be enhanced if the attended physical-motion cue was presented before the ignored visual-motion cue, and beta ERS in response to attention to visual motion would be enhanced if the attended visual-motion cue was presented before the ignored physical-motion cue. Moreover, if gamma ERS is most powerful during conditions of temporal synchrony (Senkowski et al., 2007), the present study may provide evidence that gamma ERS is an index of general processes related to multisensory binding and integration across multiple sensory systems. If this is not the case, feature binding-related gamma ERS may be specific to processes such as audiovisual and visual-tactile integration.

Participants
Thirty-six participants (20 women) were recruited from the McMaster University psychology participant pool and the McMaster community. The sample size was sufficient based on a power analysis of data from our previous study (Townsend et al., 2019; 37 sample size, 0.73 effect size, 0.05 error probability, 0.95 power, four measurements) conducted with G*Power Software (Faul, Erdfelder, Buchner, & Lang, 2009). Ages ranged from 17 to 23 years (M = 18 years, SD = 1.30 years). Those recruited from the participant pool were compensated with course credits. All participants self-reported normal or corrected-to-normal visual acuity and reported no major problems with vertigo, motion sickness, or claustrophobia. This experiment was approved by the Hamilton Integrated Research Ethics Board and complied with the Canadian tri-council policy on ethics.

Visual Motion Stimuli
Visual motion stimuli were presented on a 43-in. LCD panel, 51 in. in front of the participant, subtending a visual angle of 41°. The panel had a refresh rate of 60 Hz and a resolution of 1920 × 1080 (1080p).
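The reported visual angle can be checked with basic trigonometry. The sketch below is only illustrative: it assumes that the 43-in. figure is the panel diagonal and that the panel has a 16:9 aspect ratio (consistent with the 1920 × 1080 resolution), neither of which is stated explicitly in the text. Under those assumptions the horizontal angle comes out close to the reported 41°.

```python
import math

def visual_angle_deg(extent, distance):
    """Full visual angle (degrees) subtended by an extent centred on the line of sight."""
    return math.degrees(2 * math.atan((extent / 2) / distance))

# Assumption: 43 in. is the diagonal of a 16:9 panel.
DIAGONAL_IN = 43.0
VIEWING_DISTANCE_IN = 51.0  # from the text

panel_width_in = DIAGONAL_IN * 16 / math.hypot(16, 9)
horizontal_angle = visual_angle_deg(panel_width_in, VIEWING_DISTANCE_IN)

print(f"panel width: {panel_width_in:.1f} in.")
print(f"horizontal visual angle: {horizontal_angle:.1f} deg")
```

The units cancel in the ratio of extent to distance, so inches can be used directly without conversion.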
The visual display, which contributed to the perception of self-motion, was composed of a fixation cross in the center of the display and two tracks on a gray surface. Each track consisted of a series of yellow dashes perpendicular to the length of the track, drawn in perspective to a vanishing point so that the track appeared to extend into the distance. One track veered right, whereas the other veered left, both at 35°, starting at the lower center of the display. Both tracks together subtended a horizontal visual angle of 33.69°. A horizon line was created by a gray surface upon which the tracks lay, and a blue sky with white clouds above, accentuating the perception of traveling along a track into the distance. The perception of self-motion along the track was created via a first-person viewpoint animation that simulated a forward trajectory to align with the acceleration and perceived velocity that result from the physical-motion cues (see Figure 1B and C for two temporal snapshots). The duration of the visual-motion stimulus on each trial was 700 msec, which included a 200-msec acceleration period followed by 500 msec at a fixed velocity. This was followed by a 960-msec pause in the final position at the end of the track. At the completion of the trial (1660 msec), the visual stimulus was reset to the starting position of the tracks.

Physical Motion Stimuli
A motion simulator provided physical-motion stimuli. The motion simulator cabin was supported by a MOOG Stewart platform with six-degrees-of-freedom motion (Moog series 6DOF2000E). Participants were seated in a bucket-style car seat fixed to the cabin floor.
Each physical-motion stimulus consisted of the cabin moving in a forward linear translation, 35° left or right, for 330 msec at 0.01 g. This forward acceleration was presented as a precomputed parabolic movement of the platform. This surge was followed by a corresponding 1330-msec washout (see Figure 1A). During the washout period, the cabin was slowly returned to its original position at a rate below the threshold for detecting the direction of movement. Figure 1A also illustrates motion noise above 60 Hz, which is attributable to mechanical vibrations of the simulator. We also presented very small movements in random directions other than the forward motion that simulated the feel and sound of wheels on the road, and which also helped to mask mechanical vibrations and the direction of washout motion. As can be seen in the figure, the mechanical vibrations and injected noise have very low energy, which is experienced as a rumbling accompanying the perception of forward motion. The acceleration intensity was selected based on preliminary testing to achieve a clear perception of forward motion within the spatial restrictions of the movement of the platform while minimizing compensating movements of the head, neck, or upper body (Townsend et al., 2019). Physical forward accelerations were well above vestibular thresholds of .009 g, as discussed by Kingma (2005). The motion force, s(t), was described by: where t represents time in seconds, t_p represents present time, t_b represents the breakpoint, and t_e represents the end time. A_1 describes the initial forward acceleration, −A_2 describes the initial (backward) acceleration of the washout, and A_2 describes the deceleration of the washout. Acceleration was measured using an Endevco accelerometer (model number 752A13), calibrated to approximately 1-mV/g sensitivity.
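The visual and physical stimulus timelines described above can be laid side by side to confirm that both span the same 1660-msec trial. The following is a minimal sketch using only the phase durations given in the text; the phase labels and function names are illustrative and are not taken from the experiment software.

```python
# Phase durations in msec, taken from the stimulus descriptions.
VISUAL_PHASES = [("acceleration", 200), ("constant velocity", 500), ("pause", 960)]
PHYSICAL_PHASES = [("surge", 330), ("washout", 1330)]

def total_duration(phases):
    """Sum of all phase durations (msec)."""
    return sum(duration for _, duration in phases)

def phase_at(phases, t_msec):
    """Name of the phase active t_msec after that cue's onset, or None if outside the trial."""
    if t_msec < 0:
        return None
    phase_end = 0
    for name, duration in phases:
        phase_end += duration
        if t_msec < phase_end:
            return name
    return None

print(total_duration(VISUAL_PHASES), total_duration(PHYSICAL_PHASES))
print(phase_at(VISUAL_PHASES, 250), "|", phase_at(PHYSICAL_PHASES, 250))
```

At 250 msec after onset, for example, the visual display has already reached its fixed velocity while the platform is still in its surge, which illustrates that the two cues unfold differently even within a trial of the same total length.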

Procedure
The entire session was between 1.5 and 2 hr in duration. The timeline of the session included collection of demographic information, followed by completion of one practice block (30 trials; ∼2 min), application of EEG electrodes (25 min), completion of four experimental blocks (60 min), and participant clean up and debriefing (15 min). There were 796 experimental trials divided into four blocks of 199 trials each. Participants fixated on the fixation cross for the duration of each trial; a blink break was provided every 15 trials. The attend-visual (AV) and attend-physical (AP) tasks were blocked to avoid task switching effects. The task required participants to direct attention to the visual-motion stimulus and ignore the physical-motion cues (AV task) or to direct attention to the physical-motion stimulus and ignore the visual-motion cues (AP task). They responded with a button press to indicate whether the direction of the attended-modality motion was left or right heading.
Given the importance of collecting enough clean data with correct responses in each attention condition for EEG analyses, and given that participants have a more difficult time ignoring the visual while attending the physical stimulus (Townsend et al., 2019), we collected three AP blocks compared with one AV block. Presentation order was controlled so that the AV block was presented as the first, second, or third of the four blocks. Moreover, to ensure that participants maintained attention to the intended modality (especially during AP blocks), each block contained eight catch trials in which the ignored modality heading was incongruent with the attended modality heading.

Figure 1. (A) … the high-frequency component represents the high sensitivity of the accelerometer (sensitive to 0.0001 g, sampling at 1 kHz). Note that the frequencies above 60 Hz represent mechanical vibrations of the motion system and simulator. The x axis represents time, and the y axis represents acceleration (g = m/sec²). The acceleration profile is similar for 35° left and 35° right physical-motion trials. (B) The visual display before the onset of motion; at this point, the participant does not know whether visual motion will indicate travel along the left or right track. (C) A still screen capture of the dynamic visual motion display at approximately 1 sec after visual onset of a left visual motion trial.
SOA was manipulated to produce simultaneous (S), visual-first (V1st), and physical-first (P1st) conditions. In the simultaneous condition, the visual and physical motion cues began at the same time. In the V1st condition, the visual motion stimulus began 100 msec before the physical motion, and in the P1st condition, the physical motion stimulus began 100 msec before the visual motion. The duration of 100 msec was selected as the SOA based on previous research that demonstrated a window in which temporal alignment of visual-vestibular cues speeds up the perception of self-motion (Kenney et al., 2020; O'Malley, Townsend, von Mohrenschildt, & Shedden, 2015). This research provided evidence that a temporal misalignment of 100 msec delayed the responses to the self-motion cues, relative to visual-vestibular cues that were closer in temporal alignment. Thus, the benefits of multisensory integration were weakened, which was the case regardless of which motion cue was being attended. There were an equal number of left and right heading trials in each block, randomly presented.
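The three SOA conditions and their relation to the attended modality can be summarized in a few lines. This is a sketch only: the function names are hypothetical, and expressing onsets relative to the first cue is an assumption made for illustration, since the text does not state which onset served as time zero.

```python
SOA_MSEC = 100  # lag between cue onsets in the asynchronous conditions (from the text)

def cue_onsets(condition):
    """Return (visual_onset, physical_onset) in msec, relative to the first cue (assumed)."""
    onsets = {
        "S": (0, 0),            # simultaneous
        "V1st": (0, SOA_MSEC),  # visual leads physical
        "P1st": (SOA_MSEC, 0),  # physical leads visual
    }
    if condition not in onsets:
        raise ValueError(f"unknown SOA condition: {condition!r}")
    return onsets[condition]

def attended_cue_leads(task, condition):
    """True when the attended modality starts before the ignored one."""
    visual_onset, physical_onset = cue_onsets(condition)
    if task == "AV":
        return visual_onset < physical_onset
    if task == "AP":
        return physical_onset < visual_onset
    raise ValueError(f"unknown task: {task!r}")

for task in ("AV", "AP"):
    for condition in ("V1st", "S", "P1st"):
        print(task, condition, cue_onsets(condition), attended_cue_leads(task, condition))
```

The cells for which `attended_cue_leads` is true, AV(V1st) and AP(P1st), are the two conditions in which the hypotheses predict the strongest beta effect for the attended modality.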

EEG Data Acquisition
EEG data were collected using the BioSemi ActiveTwo electrophysiological system (www.biosemi.com) with 128 sintered Ag/AgCl scalp electrodes. Four additional electrodes recorded eye movements (two placed laterally from the outer canthi and two below the eyes on the upper cheeks). Continuous signals were recorded using an open pass band from direct current to 150 Hz and digitized at 1024 Hz.

EEG Preprocessing
All processing was performed in MATLAB 2014a (The MathWorks) using functions from EEGLAB (Delorme & Makeig, 2004) on the Shared Hierarchical Academic Research Computing Network (www.sharcnet.ca). EEG data were band-pass filtered between 1 and 50 Hz, and epoched from 1000 msec prestimulus to 2000 msec poststimulus. Each epoch was baseline corrected using the whole-epoch mean (Groppe, Makeig, & Kutas, 2009). Channels with a standard deviation exceeding 200 μV were interpolated after referencing (on average, 0.97 channels interpolated per participant, with a total of 35 channels interpolated). Bad epochs were rejected if they had voltage spikes exceeding 500 μV or violated EEGLAB's joint probability functions (Delorme, Sejnowski, & Makeig, 2007).
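The epoching, whole-epoch baseline correction, and amplitude-based rejection steps can be illustrated for a single channel. The sketch below is plain Python rather than the EEGLAB functions actually used, the function names are hypothetical, and applying the 500-μV criterion to the absolute baseline-corrected amplitude is an assumption; the joint-probability criterion is not reproduced.

```python
from statistics import fmean

def extract_epoch(signal, onset_idx, srate, pre_s=1.0, post_s=2.0):
    """Cut samples from pre_s before to post_s after onset; None if out of range."""
    start = onset_idx - int(round(pre_s * srate))
    stop = onset_idx + int(round(post_s * srate))
    if start < 0 or stop > len(signal):
        return None
    return signal[start:stop]

def baseline_whole_epoch(epoch):
    """Subtract the mean of the entire epoch from every sample."""
    epoch_mean = fmean(epoch)
    return [value - epoch_mean for value in epoch]

def has_voltage_spike(epoch, limit_uv=500.0):
    """Assumed criterion: any absolute amplitude above limit_uv marks the epoch as bad."""
    return any(abs(value) > limit_uv for value in epoch)

# Synthetic single-channel data (microvolts) sampled at 1024 Hz.
SRATE = 1024
clean = [10.0] * 5000
spiky = list(clean)
spiky[2500] = 700.0

for name, signal in (("clean", clean), ("spiky", spiky)):
    epoch = baseline_whole_epoch(extract_epoch(signal, 2000, SRATE))
    print(name, len(epoch), has_voltage_spike(epoch))
```

Each epoch holds 3072 samples (1024 before and 2048 after onset), so the whole-epoch mean is dominated by the 2000-msec poststimulus interval rather than by the prestimulus period alone.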
Single-subject EEG data were submitted to an extended adaptive mixture independent component (IC) analysis (Palmer, Kreutz-Delgado, & Makeig, 2012) with an n − (1 + interpolated channels) principal components analysis reduction (Makeig, Bell, Jung, & Sejnowski, 1995). Decomposing an EEG signal into ICs allows for analysis of individual signals produced by the brain that would otherwise be indistinguishable. Following adaptive mixture IC analysis, dipoles were fit to each IC using the FieldTrip plugin for EEGLAB (Oostenveld, Fries, Maris, & Schoffelen, 2011). ICs for which dipoles were located outside the brain, or explained less than 85% of the weight variance, were excluded from further analysis. On average, 20.47 ICs per participant were excluded from analysis.
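The dimensionality reduction and the IC selection rule amount to simple bookkeeping, sketched below with hypothetical function names. The usual rationale for the n − (1 + interpolated channels) rule, that referencing removes one degree of freedom and each interpolated channel is a linear combination of its neighbors, is offered here as background rather than as a statement from the text.

```python
def pca_rank(n_channels, n_interpolated):
    """Number of principal components retained: n - (1 + interpolated channels)."""
    if n_interpolated < 0 or n_interpolated >= n_channels:
        raise ValueError("n_interpolated must be between 0 and n_channels - 1")
    return n_channels - (1 + n_interpolated)

def keep_ic(dipole_inside_brain, variance_explained, min_explained=0.85):
    """Keep an IC only if its dipole is inside the brain and explains at least 85% of variance."""
    return dipole_inside_brain and variance_explained >= min_explained

# 128 scalp channels, as in the recording setup described earlier.
for n_interp in (0, 1, 3):
    print(n_interp, "interpolated ->", pca_rank(128, n_interp), "components")

candidates = [(True, 0.92), (True, 0.80), (False, 0.95)]
print([keep_ic(inside, explained) for inside, explained in candidates])
```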

ERSP Measure Projection Analysis
ERSP was computed for each of the remaining ICs. Fifty log-spaced frequencies between 3 and 50 Hz were computed, with three cycles per wavelet at the lowest frequency up to 25 at the highest. MPA was used to cluster ICs across participants using the Measure Projection Toolbox for MATLAB (Bigdely-Shamlo, Mullen, Kreutz-Delgado, & Makeig, 2013). MPA is a method of categorizing the location and consistency of EEG measures, such as ERSP, across single-subject data into 3-D domains. Each domain is a subset of ICs that are identified as having spatially similar dipole models, as well as similar cortical activity (measure-similarity). MPA fits the selected ICs into a 3-D model of the brain, composed of a cubic space grid with 8-mm spacing according to normalized Montreal Neurological Institute space. The MPA toolbox identified cortical regions of interest by incorporating the probabilistic atlas of human cortical structures provided by the Laboratory of Neuroimaging project (Shattuck et al., 2008). Voxels that fell outside of the brain model (muscle artifacts, etc.) were excluded from the analysis.
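The frequency grid for the wavelet decomposition can be reproduced directly from the parameters above. In the sketch below, the 50 log-spaced frequencies follow the text; how the number of cycles grows between 3 and 25 is not specified, so a linear increase with frequency is assumed purely for illustration.

```python
import math

def log_spaced(lowest, highest, count):
    """Return count frequencies spaced evenly on a logarithmic axis, endpoints included."""
    step = (math.log(highest) - math.log(lowest)) / (count - 1)
    return [math.exp(math.log(lowest) + i * step) for i in range(count)]

def cycles_for(freq, f_lo=3.0, f_hi=50.0, c_lo=3.0, c_hi=25.0):
    """Assumed rule: wavelet cycles increase linearly with frequency from c_lo to c_hi."""
    return c_lo + (c_hi - c_lo) * (freq - f_lo) / (f_hi - f_lo)

def wavelet_duration_s(freq):
    """Approximate temporal extent of the wavelet: cycles divided by frequency."""
    return cycles_for(freq) / freq

freqs = log_spaced(3.0, 50.0, 50)
for f in (freqs[0], freqs[len(freqs) // 2], freqs[-1]):
    print(f"{f:6.2f} Hz  cycles={cycles_for(f):5.2f}  duration={wavelet_duration_s(f):.3f} s")
```

Increasing the number of cycles with frequency keeps the wavelets at high frequencies from becoming too short, which trades some temporal precision for better frequency resolution in the beta and gamma ranges.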
We then calculated local convergence values, using an algorithm based on Bigdely-Shamlo et al. (2013), which deals with the multiple comparisons problem. Local convergence calculates the measure-similarity of dipoles within a given domain and compares them with randomized dipoles. A pairwise IC similarity matrix was created by estimating the signed mutual information between IC-pair ERSP measure vectors, assuming a Gaussian distribution, to compare dipoles. As explained in detail by Bigdely-Shamlo et al. (2013), signed mutual information was estimated to improve the spatial smoothness of the obtained MPA significance value beyond determining similarity of dipoles through correlation. Bootstrap statistics were used to obtain a significance threshold for convergence at each location of our 3-D brain model. Following past literature, we set the raw voxel significance threshold to p < .001 (Chung, Ofori, Misra, Hess, & Vaillancourt, 2017;Bigdely-Shamlo et al., 2013).
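For two jointly Gaussian variables with correlation r, mutual information has the closed form −½ ln(1 − r²); attaching the sign of r gives a signed similarity. The sketch below applies that textbook formula to a pair of measure vectors. It is an illustration under the Gaussian assumption and should not be read as the toolbox's exact implementation; the function names and the clipping constant are hypothetical.

```python
import math
from statistics import fmean

def pearson_r(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def signed_mutual_information(x, y, eps=1e-12):
    """Gaussian mutual information (in nats) carrying the sign of the correlation."""
    r = pearson_r(x, y)
    r = max(-1.0 + eps, min(1.0 - eps, r))  # keep the logarithm finite when |r| is 1
    mi = -0.5 * math.log(1.0 - r * r)
    return math.copysign(mi, r)

ersp_a = [0.2, 0.5, 0.9, 1.4, 1.1, 0.3]
ersp_b = [0.1, 0.6, 1.0, 1.2, 1.0, 0.4]
ersp_c = [-v for v in ersp_b]

print(signed_mutual_information(ersp_a, ersp_b))
print(signed_mutual_information(ersp_a, ersp_c))
```

Unlike a raw correlation, this quantity grows without bound as |r| approaches 1, so strongly similar IC pairs are separated more clearly from moderately similar ones.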
Our analyses focused on two relevant domains: the right motor area, with the greatest concentration of dipoles consistent with right premotor and SMA (BA 6), and the left motor area, with the greatest concentration of dipoles consistent with left premotor and SMA (BA 6). For the right motor area, each participant contributed, on average, 2.33 (±1.53) ICs, with each participant contributing at least one IC, with a range from 1-7 ICs. For the left motor area, each participant contributed, on average, 2.19 (±1.51) ICs. There were five participants who did not contribute to this domain. The range of contributed ICs was 0-6.
ERSPs were computed for each experimental condition within each domain calculated by MPA. Bootstrap statistics were used to assess differences in ERSP between conditions to uncover main effects of task and SOA. Differences at each power band were computed by projecting the ERSP for each condition to each voxel in the domain. This projection was weighted by dipole density per voxel and then normalized by the total domain voxel density for each participant. Projected source measures were separated into discrete spatial domains by threshold-based affinity propagation clustering based on a similarity matrix of pairwise correlations between ERSP measure values for each position. Following Chung et al. (2017), we set the maximal exemplar-pair similarity, which ranges from 0 to 1.0, to a value of 0.8 (Chung et al., 2017; Ofori, Coombes, & Vaillancourt, 2015; Bigdely-Shamlo et al., 2013).
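A bootstrapped comparison between two conditions can be sketched for a single time-frequency point as follows. This is a generic paired percentile bootstrap over participants, assumed here for illustration; the exact resampling scheme used by the toolbox is not described in the text, and the data values below are synthetic.

```python
import random
from statistics import fmean

def bootstrap_diff_ci(cond_a, cond_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile confidence interval for the mean paired difference (a - b)."""
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    boot_means = sorted(fmean(rng.choices(diffs, k=n)) for _ in range(n_boot))
    lower = boot_means[int((alpha / 2) * n_boot)]
    upper = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

def conditions_differ(cond_a, cond_b, **kwargs):
    """Treat the conditions as different when the interval excludes zero."""
    lower, upper = bootstrap_diff_ci(cond_a, cond_b, **kwargs)
    return lower > 0 or upper < 0

# Synthetic per-participant ERSP values at one time-frequency point.
first = [1.0, 1.2, 0.9, 1.1, 1.3, 1.0]
second = [0.1, 0.2, 0.0, 0.1, 0.3, 0.2]

print(bootstrap_diff_ci(first, second))
print(conditions_differ(first, second), conditions_differ(first, first))
```

In a full analysis this test would be repeated at every time-frequency point, which is why a correction for multiple comparisons or a conservative threshold is needed.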

Behavioral Results
Behavioral data were analyzed with two 2 × 3 repeated-measures ANOVAs for measures of judgment accuracy and RT. Outliers were defined as trials with RTs greater than 3 SDs above or below the mean in each condition and were eliminated from all further analyses. The Greenhouse-Geisser correction was applied to all effects that violated Mauchly's test of sphericity. All behavioral results are illustrated in Figure 2.
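The outlier rule can be expressed compactly. The sketch below trims one condition's RTs at 3 SDs from that condition's mean; the function name is hypothetical, the sample standard deviation is assumed, and the RT values are synthetic.

```python
from statistics import fmean, stdev

def trim_rt_outliers(rts, n_sd=3.0):
    """Drop RTs more than n_sd sample standard deviations from the condition mean."""
    mean_rt = fmean(rts)
    sd_rt = stdev(rts)
    return [rt for rt in rts if abs(rt - mean_rt) <= n_sd * sd_rt]

# Synthetic RTs (msec) for one condition, with one extreme trial.
condition_rts = [1000 + 10 * i for i in range(19)] + [5000]
trimmed = trim_rt_outliers(condition_rts)

print(len(condition_rts), "->", len(trimmed))
```

Because the rule is applied within each condition, the function would be called once per cell of the 2 × 3 design rather than on the pooled data.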

Response Time
Participants were faster at discriminating direction in the … with simultaneous (AV(S)) trials (M = 1020 msec, SE = 90.19; p < .001), which were in turn faster than physical-first (AV(P1st)) trials (M = 1135 msec, SE = 88.36; p < .001). Likewise, during the attend-physical task, responses were faster for the AP(V1st) trials (M = 1269 msec, SE = 77.69) compared with simultaneous (AP(S)) trials (M = 1406 msec, SE = 79.04; p < .001), which were in turn faster than AP(P1st) trials (M = 1552 msec, SE = 80.12; p < .001). Thus, two important observations are that (1) participants are faster overall when attending to visual motion, but importantly, (2) both attend-visual and attend-physical conditions are highly sensitive to which stimulus was presented first. Exploring the ERSP results provides insights into how the temporal order of stimuli may be affecting multisensory integration and thus leading to differences in accuracy and RTs.

Oscillatory Power
Effects of SOA in Attend-Visual Task
Figure 3 presents a comparison of the left and right motor areas to illustrate the effect of the timing of the stimulus onset on the cortical activity during the attend-visual conditions in both MPA domains. All ERSP represents a difference in oscillatory power compared with baseline (pretrial) cortical activity, where an ERS represents more spectral power than baseline and an ERD represents less spectral power than baseline. The 1000-msec baseline EEG was recorded during the ISI before each trial, while the simulator was stationary and participants were fixating on the fixation cross. Figure 3A shows the left motor area, with the highest dipole density in the premotor and SMA (BA 6), and Figure 3D shows the right motor area, with the highest dipole density in the premotor and SMA (BA 6). In Panel B (left motor) and E (right motor), we show the associated ERSP plots for the AV(V1st), AV(S), and AV(P1st) conditions. The ERSP plots are followed by bootstrapped comparisons (α = .05) between each possible pair of conditions for left (Panel C) and right (Panel F) motor areas. The following sections will describe observations of the activity changes associated with experimental conditions across frequency bands theta, alpha, beta, and gamma. All of the comparisons outlined in the following sections were significant at p < .05.
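The distinction between ERS and ERD follows from the sign of the baseline-relative power change. The sketch below expresses that change in decibels, which is a common convention for ERSP but is assumed here rather than stated in the text; the function names and example values are illustrative.

```python
import math

def ersp_db(power, baseline_power):
    """Power change relative to baseline in decibels (assumed convention)."""
    return 10.0 * math.log10(power / baseline_power)

def label_change(ersp_value, tolerance=0.0):
    """ERS for more power than baseline, ERD for less, otherwise no change."""
    if ersp_value > tolerance:
        return "ERS"
    if ersp_value < -tolerance:
        return "ERD"
    return "no change"

baseline = 4.0  # arbitrary units, averaged over the prestimulus baseline
for power in (8.0, 4.0, 2.0):
    change = ersp_db(power, baseline)
    print(f"power={power:4.1f}  change={change:+.2f} dB  {label_change(change)}")
```

On this scale a doubling of power relative to baseline is about +3 dB and a halving is about −3 dB, so ERS and ERD of equal proportional size are symmetric around zero.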
Theta-band latency differences. The AV(P1st) condition elicited theta ERS significantly later than the AV(S) and AV(V1st) conditions. Specifically, in both the left and right motor areas (Panels C and F, respectively), AV(S) elicited greater theta ERS from ∼100 msec to 200 msec poststimulus and AV(P1st) elicited greater theta ERS later in the trial, from ∼500 msec to 950 msec poststimulus. Likewise, AV(V1st) elicited greater theta ERS from stimulus onset to 300 msec poststimulus and AV(P1st) elicited greater theta ERS from ∼500 msec to 1000 msec poststimulus.
Beta-band power differences. Much like the results in the alpha band, we found that the earlier the physical motion was presented, the stronger the elicited beta-band ERD power. In the left and right motor areas (C and F, respectively), AV(P1st) elicited the strongest beta ERD, compared with AV(S) (∼500-1500 msec poststimulus) and AV(V1st) (∼400-1500 msec poststimulus), and AV(S) elicited stronger beta ERD than AV(V1st) (∼300-1000 msec poststimulus). Thus, in general, beta ERD was ordered AV(P1st) > AV(S) > AV(V1st).
Gamma-band power differences. AV(V1st) elicited a more powerful gamma ERS than AV(P1st) from ∼600-1200 msec poststimulus in the right motor area (F).

Effects of SOA in Attend-Physical Task
Figure 4 presents a comparison of the same left and right motor areas as Figure 3 to illustrate the effect of stimulus onset timing on the cortical activity during the attend-physical conditions in both MPA domains. All of the comparisons outlined in the following sections were significant at p < .05.
Theta-band latency differences. The AP(P1st) condition elicited theta ERS significantly later than the AP(S) and AP(V1st) conditions. Specifically, in both the left and right motor areas (C and F, respectively), AP(S) elicited greater theta ERS from stimulus onset to ∼300 msec poststimulus and AP(P1st) elicited greater theta ERS later in the trial, from ∼500 msec to 600 msec poststimulus. Likewise, AP(V1st) elicited greater theta ERS from stimulus onset to ∼400 msec poststimulus and AP(P1st) elicited greater theta ERS from ∼500 msec to 600 msec poststimulus.
Beta-band power differences. In the left and right motor areas (C and F, respectively), AP(P 1st) elicited the strongest beta ERD, compared with AP(S) (∼550-1500 msec poststimulus) and AP(V 1st) (∼500-1500 msec poststimulus), and AP(S) elicited stronger alpha ERD than AP(V 1st) (∼800-1200 msec poststimulus). Thus, in general, beta ERD was ordered AP(P 1st) > AP(S) > AP(V 1st).

[...] AP(V 1st), and AV(P 1st) vs. AP(P 1st)). Similar results were found in the left motor area. All of the comparisons outlined in the following sections were significant at p < .05.

Figure caption (attend-visual conditions). [...] (msec) across the x axis and frequency of the EEG signal along the y axis. Panels B (left) and E (right) show the associated ERSP plots for the attend-visual visual first (AV(V 1st)), attend-visual simultaneous (AV(S)), and attend-visual physical first (AV(P 1st)) conditions. Panels C (left motor area) and F (right motor area) show the bootstrapped comparisons (p < .05) between each possible pair of conditions. ERS power is depicted in yellow/red, ERD power is depicted in blue, and green shows no difference in spectral power compared with baseline. MPA motor areas: (A and D) 3-D representations of the brain with the yellow region representing the left motor area and the blue region representing the right motor area. The greatest concentration of dipoles in the left and right regions was consistent with premotor and SMAs (BA 6). (B and E) ERSP plots for each condition. (C and F) Bootstrapped comparisons examine each possible pair of conditions; frequency and time of significant comparisons are shown by the colored boxes. Both left and right motor areas show similar conditional differences. Theta: AV(V 1st) and AV(S) elicit theta ERS significantly earlier than AV(P 1st) (white boxes). Alpha: AV(P 1st) elicits stronger alpha ERD than AV(S) and AV(V 1st), and AV(S) elicits stronger alpha ERD than AV(V 1st) (black boxes). Beta: AV(P 1st) elicits stronger beta ERD than AV(S) and AV(V 1st), and AV(S) elicits stronger beta ERD than AV(V 1st) (brown boxes). Gamma: Differences in gamma existed only in the right motor area: The AV(V 1st) condition elicited significantly stronger gamma ERS than AV(P 1st) (red boxes).

Effects of Attention Allocation across SOA Conditions
Theta-band power differences. AV(S) elicited a more powerful theta ERS than AP(S) from ∼250 msec to 400 msec poststimulus (C).
Beta-band power differences. In the right motor area (A), AP(P 1st) elicited a stronger beta ERD than AV(P 1st) from ∼550-1500 msec poststimulus (B), AV(S) elicited a stronger beta ERS than AP(S) from ∼800 msec to the end of the trial (C), and AV(V 1st) elicited more powerful beta ERS than AP(V 1st) from ∼700 msec to the end of the trial (D).

Figure caption (attend-physical conditions). [...] (msec) across the x axis and frequency of the EEG signal along the y axis. Panels B (left) and E (right) show the associated ERSP plots for the attend-physical visual first (AP(V 1st)), attend-physical simultaneous (AP(S)), and attend-physical physical first (AP(P 1st)) conditions. Panels C (left motor area) and F (right motor area) show the bootstrapped comparisons (p < .05) between each possible pair of conditions. ERS power is depicted in yellow/red, ERD power is depicted in blue, and green shows no difference in spectral power compared with baseline. MPA motor areas: (A) and (D) show 3-D representations of the brain, with the yellow region representing the left motor area and the blue region representing the right motor area. The greatest concentration of dipoles in the left and right regions was consistent with premotor and SMAs (BA 6). (B and E) ERSP plots for each condition. (C and F) Bootstrapped comparisons examine each possible pair of conditions; frequency and time of significant comparisons are shown by the colored boxes. Both left and right motor areas show similar conditional differences. Theta: AP(V 1st) and AP(S) elicit theta ERS significantly earlier than AP(P 1st) (white boxes). Alpha: AP(P 1st) elicits stronger alpha ERD than AP(S) and AP(V 1st), and AP(S) elicits stronger alpha ERD than AP(V 1st) (black boxes). Beta: AP(P 1st) elicits stronger beta ERD than AP(S) and AP(V 1st), and AP(S) elicits stronger beta ERD than AP(V 1st) (brown boxes).
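The pairwise condition comparisons in this section are described as bootstrapped at p < .05, but the exact resampling procedure is not spelled out here. As a rough illustration of the general idea only, the following sketch applies a percentile bootstrap to paired, band-averaged ERSP values (one value per participant per condition). The function name, the number of resamples, and the use of paired differences are assumptions made for illustration and may differ from the analysis pipeline actually used.

```python
import numpy as np

def bootstrap_paired_diff(power_a, power_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap for the mean paired difference (a - b).

    power_a, power_b: 1-D arrays with one band-averaged ERSP value (dB)
    per participant, in the same participant order (hypothetical inputs).
    Returns (mean_diff, ci_low, ci_high, significant).
    """
    power_a = np.asarray(power_a, dtype=float)
    power_b = np.asarray(power_b, dtype=float)
    if power_a.shape != power_b.shape or power_a.ndim != 1:
        raise ValueError("inputs must be 1-D arrays of equal length")

    diffs = power_a - power_b
    n = diffs.size
    rng = np.random.default_rng(seed)

    # Resample participants with replacement and recompute the mean difference.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_means = diffs[idx].mean(axis=1)

    ci_low, ci_high = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    # The difference is flagged when the confidence interval excludes zero.
    significant = bool(ci_low > 0 or ci_high < 0)
    return float(diffs.mean()), float(ci_low), float(ci_high), significant
```

In this sketch, a negative mean difference with an interval entirely below zero would correspond to condition a showing lower power (stronger ERD) than condition b in the chosen band and time window.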

DISCUSSION
Behavioral research has demonstrated a temporal binding window for visual-vestibular integration, in which multisensory integration affects heading perception, temporal order judgements, and attention allocation (e.g., Rodriguez & Crane, 2021; Shayman et al., 2018). Research exploring the cortical processes underlying this temporal window is currently scarce. To better understand the online processes related to multisensory temporal binding, we must look to literature focused on the integration of other senses, such as audiovisual or visuotactile integration. Studies such as Senkowski et al. (2007) have demonstrated that the closer in time audiovisual stimuli are presented, the more powerful the elicited feature-binding gamma ERS response. Past multisensory research has demonstrated a Gaussian integration window, in which integration breaks down at a temporal asynchrony specific to the senses being integrated (e.g., Rodriguez & Crane, 2021). The present study explored how EEG oscillations related to attention and multisensory weighting in self-motion perception (theta, alpha, and beta; Townsend et al., 2019, 2022) and multisensory feature binding (gamma; Senkowski et al., 2007) were affected by varying conditions of SOA. All differences in cortical activity discussed are projected from the motor area (likely including integrative areas such as ventral intraparietal area and medial superior temporal area) based on the MPA, which identified ROIs across participants.

The Effects of Timing Onset within an Attended Modality
Figure caption (attend-physical vs. attend-visual comparisons, right motor area). [...] show the associated ERSP plots for the attend-physical and attend-visual conditions at each level of the SOA condition, and the bootstrapped comparisons (p < .05) between each pair of conditions. ERS power is depicted in yellow/red, ERD power is depicted in blue, and green shows no difference in spectral power compared with baseline. MPA right motor area: (A) 3-D representations of the brain with the blue region representing the right motor area. The greatest concentration of dipoles in the right region was consistent with premotor and SMAs (BA 6). (B, C, and D) Bootstrapped comparisons examine each possible pair of conditions; frequency and time of significant comparisons are shown by the colored boxes. Theta: AV(S) elicits stronger theta ERS than AP(S) (C; white box). Alpha: AV(S) elicits stronger alpha ERD than AP(S) (C), and AP(V 1st) elicits stronger alpha ERD than AV(V 1st) (D; black boxes). Beta: AP(P 1st) elicits stronger beta ERD than AV(P 1st) (B), AV(S) elicits stronger beta ERS than AP(S) (C), and AV(V 1st) elicits stronger beta ERS than AP(V 1st) (D; brown boxes).

Recent research by Townsend et al. (2019, 2022) showed that theta, alpha, and beta oscillations reveal brain networks involved in the perception of self-motion. Moreover, the power of these individual oscillations changed dynamically depending on which sensory inputs were attended to. Taken together, our two previous studies demonstrated that the beta band is most sensitive to changes in visual-vestibular weighting. Specifically, these studies showed that a strong beta ERS is an electrophysiological signature of heavy visual weighting, and a strong beta ERD is a signature of vestibular weighting.
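For readers less familiar with the ERS/ERD terminology, the sign convention can be made concrete with a small sketch. ERSP is commonly expressed as a decibel change in spectral power relative to a prestimulus baseline, so that positive values indicate synchronization (ERS) and negative values indicate desynchronization (ERD). The baseline window, the 13-30 Hz beta range, and the function names below are assumptions chosen for illustration; they are not taken from the analysis reported in this article.

```python
import numpy as np

def ersp_db(power, times_ms, baseline=(-500.0, 0.0)):
    """Convert trial-averaged power (n_freqs x n_times) to dB change
    relative to the mean baseline power at each frequency."""
    power = np.asarray(power, dtype=float)
    times_ms = np.asarray(times_ms, dtype=float)
    mask = (times_ms >= baseline[0]) & (times_ms < baseline[1])
    if not mask.any():
        raise ValueError("baseline window contains no time points")
    base = power[:, mask].mean(axis=1, keepdims=True)
    return 10.0 * np.log10(power / base)

def band_mean(ersp, freqs_hz, times_ms, band=(13.0, 30.0), window=(500.0, 1500.0)):
    """Average ERSP over a frequency band and a poststimulus time window.
    Positive result: ERS. Negative result: ERD."""
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    times_ms = np.asarray(times_ms, dtype=float)
    f_mask = (freqs_hz >= band[0]) & (freqs_hz <= band[1])
    t_mask = (times_ms >= window[0]) & (times_ms <= window[1])
    return float(ersp[np.ix_(f_mask, t_mask)].mean())
```

Under this convention, a doubling of beta power relative to baseline gives roughly +3 dB (ERS), and a halving gives roughly -3 dB (ERD).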
The current study revealed changes in the same spectral bands as the previously mentioned studies and contributed additional key insights to the understanding of self-motion perception. One robust result was that when an attended motion cue was presented before an ignored cue, the power of the beta oscillation associated with weighting bias toward the attended modality (ERS for visual and ERD for vestibular) was greater than during simultaneous presentation of the attended and ignored cues. This result suggests that the power of weighting-related beta oscillations during self-motion perception is sensitive to the timing of stimulus onset, and not just to attention allocation. Regardless of which modality is being attended to, the earlier the attended motion cue is presented in relation to the ignored cue, the more powerful the weighting-related ERSP. The inverse was true when the ignored cues were presented before the attended cues. Beta ERS was less powerful in the AV(P 1st) condition versus AV(S), and beta ERD was less powerful in the AP(V 1st) condition versus AP(S).
The beta cycle has long been thought to reflect an initiation and termination of motor output (for a review, see Kilavik, Zaepffel, Brovelli, MacKay, & Riehle, 2013). Contrary to this hypothesis, Townsend et al. (2019, 2022) demonstrated a beta rebound during passive full-body motion that was induced by attention, and suggested that beta oscillations during motor processing may actually reflect perceptual weighting of the visual, vestibular, and proprioceptive systems. The beta rebound may reflect the inhibition of processing the physical-motion stimuli, considering visual-vestibular integration is a subadditive process. Subadditive inhibition typically occurs during integration when there is a discrepancy in the reliability of multiple sensory inputs (Angelaki, Gu, & DeAngelis, 2009). The Townsend et al. (2022) study showed that participants performed the heading discrimination task at 99% accuracy in both visual- and physical-motion-only conditions (the same motion stimuli as the current study). Considering there were likely no significant differences in reliability between the two sensory inputs, we believe that the temporal advantage caused by the SOA led to strong inhibitory responses during integration. Our behavioral and EEG results fall in line with Townsend et al. (2019, 2022). Similar to our previous research, the average of participants' accuracy on the heading discrimination task ranged between conditions from 98-100%. We believe the oscillatory differences in the beta band between the stimulus onset timing conditions may be a product of the perceptual weights being changed because of the SOA. For example, the processing of the visual stimulus during the AV(V 1st) condition began 100 msec before the processing of the physical-motion stimulus. This perceptual head start could have increased the weighting in favor of the visual stimulus, more so than in the AV(S) condition.
A similar weighting bias may have taken place during the attend-physical conditions, as we found similar results (but in beta ERD). These power differences in ERSP did not result in differences in accuracy, however (attend-visual 99% accuracy, attend-physical 95% accuracy). We believe that the tasks may not have been sensitive enough to capture correlations between behavioral differences and oscillatory power.
RTs, on the other hand, were affected by the SOA. Keeping in mind that RTs were measured from the onset of the to-be-attended stimulus, RTs were fastest when the visual-motion cues were presented first, regardless of whether visual or physical cues were attended. In contrast, RTs were slowest when the physical-motion cues were presented first, regardless of which cue was attended. The visual system is dominant over the vestibular system, as reported in many studies (e.g., Angelaki et al., 2009), and it is not surprising that we see this RT effect with 100-msec SOAs. Visual cues also lead to faster perceptual processing compared with vestibular cues (Barnett-Cowan & Harris, 2013), and the visual cue would have provided stronger priming than the vestibular cue when attention was directed to the opposite cue. Thus, RTs benefited more when the visual-motion cue was presented first. The present study clearly demonstrates that the timing of stimulus onset is a critical component of the visual-vestibular weighting process and is indexed by dynamic changes in the beta band.
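Because RTs were referenced to the onset of the to-be-attended cue rather than to the start of the trial, the same button-press time yields different RTs depending on the lag condition. The following sketch works through that arithmetic for the 100-msec SOA. The condition labels ("S", "V1st", "P1st"), the function names, and the assumption that button-press time is recorded relative to the first cue onset are hypothetical and serve only to illustrate the referencing.

```python
SOA_MS = 100.0  # lag between cue onsets in the asynchronous conditions

def cue_onsets(condition, soa_ms=SOA_MS):
    """Return (visual_onset, physical_onset) in msec relative to the first cue."""
    if condition == "S":       # simultaneous onset
        return 0.0, 0.0
    if condition == "V1st":    # visual cue leads
        return 0.0, soa_ms
    if condition == "P1st":    # physical cue leads
        return soa_ms, 0.0
    raise ValueError(f"unknown condition: {condition!r}")

def rt_from_attended(press_ms, condition, attended, soa_ms=SOA_MS):
    """RT measured from the onset of the attended cue.

    press_ms: button-press time relative to the first cue onset (assumed).
    attended: "visual" or "physical".
    """
    visual_onset, physical_onset = cue_onsets(condition, soa_ms)
    if attended == "visual":
        onset = visual_onset
    elif attended == "physical":
        onset = physical_onset
    else:
        raise ValueError(f"unknown attended modality: {attended!r}")
    return press_ms - onset
```

For example, a press at 900 msec after the first cue corresponds to an RT of 800 msec when the visual cue is attended in the physical-first condition, but 900 msec when the visual cue is attended in the visual-first condition.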

The Interaction of Stimulus Timing and Attentional Selection
Not only did we find that the timing of stimulus onsets affected ERSP, we also found an interaction between the timing of onsets and attention allocation. This result has a direct application to pilot training; for example, current policies of Transport Canada and the Federal Aviation Administration require physical cues to motion to precede visual cues to motion during pilot simulator training. Pilots are trained to attend to visual instruments and ignore vestibular inputs caused by forces such as turbulence, to avoid spatial disorientation (Braithwaite, 1997). One question that arises from this practice is how temporal asynchrony and selective attention interact to affect pilots' multisensory processing. We compared the attend-visual versus the attend-physical conditions at each SOA condition. Our comparison of AP(S) versus AV(S) was a replication of a condition in Townsend et al. (2019), and we found similar results in the present study, the most important observation being stronger beta ERS in attend-visual conditions and stronger beta ERD in attend-physical conditions. This comparison acted as a baseline, whereas the other two comparisons presented novel findings.
The comparisons AP(P 1st) versus AV(P 1st) (contrasting attention conditions when the physical stimulus onset first), and AP(V 1st) versus AV(V 1st) (contrasting attention conditions when the visual stimulus onset first) demonstrated an interaction of attention allocation and SOA in the beta band. When the physical-motion cue was presented 100 msec before the visual cue, there were fewer ERSP differences between AP(P 1st) and AV(P 1st), compared with the baseline comparison. Most notably, the typical beta rebound elicited by attention to the visual-motion cue was not present in the AV(P 1st) condition. Based on the findings of Townsend et al. (2019, 2022), the lack of a beta rebound in the AV(P 1st) condition suggests that presenting the physical-motion cue before the visual-motion cue resulted in greater weighting of vestibular signals than if the motion cues were presented simultaneously. This finding is relevant to simulator training for pilots. If the vestibular cue to motion is presented before the visual cue, it may disrupt the operator's ability to down-weight potentially disorienting vestibular cues that pilots are trained to ignore.
The lack of a beta rebound in the AV(P 1st) condition resulted in relatively little difference in ERSP between AP(P 1st) and AV(P 1st). However, when the visual-motion cue was presented 100 msec before the physical-motion cue, there was a robust beta ERS in the AV(V 1st) condition versus a beta ERD in the AP(V 1st) condition. This analysis revealed that visual-vestibular weighting is more sensitive to changes in the onset timing of the visual cues to motion than of the vestibular cues. This finding is supported by Barnett-Cowan and Harris (2013), who demonstrated that perception of visual stimuli is faster than perception of vestibular stimuli. Considering the visual cue naturally has a temporal advantage (during simultaneous presentation), it is likely that the vestibular cue would need to be presented more than 100 msec before the visual cue to create the robust ERSP differences that were demonstrated between the conditions of attention allocation when the visual cue was presented first.

Feature-binding Gamma ERS in Visual-Vestibular Integration
We examined gamma ERS under varying conditions of SOA to test the temporal correlation hypothesis (Engel, Fries, & Singer, 2001; Singer & Gray, 1995) in the context of visual-vestibular integration. This hypothesis posits that synchronization of gamma-band oscillations is a key mechanism for integration across distributed cortical networks. Evidence supporting this hypothesis has been demonstrated in multiple studies (e.g., Senkowski et al., 2007; Sakowitz, Quiroga, Schürmann, & Başar, 2001) that typically focus on audiovisual integration. For example, Senkowski et al. (2007) presented human participants with audiovisual stimuli with varying degrees of temporal asynchrony and required them to attend to one modality-specific stimulus while ignoring the other. They found that gamma ERS was not significantly different between modalities but, for both modalities, significantly stronger gamma ERS was elicited when temporal asynchrony was 25 msec or less, compared with longer SOAs. In the present study, the temporal correlation hypothesis predicts that the simultaneous conditions (AP(S) and AV(S)) elicit stronger gamma ERS compared with the V 1st and P 1st conditions. Our results do not support this hypothesis. The present study only found differences in the gamma band when comparing the AV(V 1st) and AV(P 1st) conditions, such that AV(V 1st) elicited stronger gamma ERS than AV(P 1st). We are currently unaware of any literature directly explaining this finding. We offer two possible explanations for our results. First, visual-vestibular integration does not rely on gamma ERS to synchronize modality-specific information across cortical networks. This facilitation of gamma ERS could be specific to superadditive integration processes (e.g., audiovisual integration; Dias, McClaskey, & Harris, 2021) as opposed to subadditive integration processes (e.g., visual-vestibular integration; Angelaki et al., 2009).
Or second, visual-vestibular integration has a broader temporal window than 100 msec for gamma facilitation (compared with the Senkowski et al., 2007, temporal window of 25 msec), and therefore our experimental design was not sensitive enough to detect differences in gamma ERS because of SOA. A broader temporal window for visual-vestibular integration would be consistent with behavioral research (Rodriguez & Crane, 2021) and research demonstrating that perception of vestibular inputs is relatively slower than that of other senses (Barnett-Cowan & Harris, 2013). More research needs to be conducted to better understand the role of stimulus timing in visual-vestibular feature binding.

Limitation and Future Directions
Our heading discrimination task required participants to push a button as quickly as possible to make a heading judgment. It is possible that the preparation and execution of thumb movements during the button press contributed to the recorded EEG signal in the motor areas. Pilot studies revealed that participants had a tendency to only attend to visual cues to motion unless they were told that some physical-motion cues were spatially incongruent to visual-motion cues. Collecting RTs during the heading judgment task was important to ensure that participants attended to the correct motion cues to elicit the appropriate cortical activity. Our previous research (Townsend et al., 2019, 2022) demonstrated that RT data were diagnostic of attention allocation, such that visual headings were judged faster than physical headings. The somatosensory system detects pressure and stretch on the skin, muscles, and joints during self-motion (Lackner, 1992). The forces generated by acceleration that produce vestibular or proprioceptive cues would be strong signals of self-motion perception; however, forces generated by the acceleration of our motion simulator would have also stimulated receptors in the back, seat, and feet of the seated participants. Although there is evidence from patients with spinal lesions that the somatosensory system does not contribute significantly to our perception of self-motion (Walsh, 1961), we cannot completely rule out the somatosensory system's contribution to our EEG signal projecting from the motor areas.
Functional neuroimaging studies exploring the neural correlates of visual motion perception typically use optic flow to elicit cortical responses to vection, or the illusion of inertial motion generated by visual-only stimuli. Some studies have compared coherent optic flow to control stimuli such as random (incoherent) dot motions (e.g., Cardin & Smith, 2010), static dot patterns (e.g., Deutschländer et al., 2004), or spatially scrambled versions of the original self-motion stimulus (e.g., Barry et al., 2014). In these studies, participants are not physically moved, so researchers commonly rely on self-report data to determine whether participants experienced the vection illusion. We did not collect self-report data to determine whether participants experienced vection from our visual-motion cues in the present study. Therefore, we cannot be completely certain that our visual-motion stimuli would have elicited vection on their own. However, a large body of research has shown that visually induced vection is strengthened when paired with vestibular stimulation (e.g., Gallagher, Dowsett, & Ferrè, 2019; Weech & Troje, 2017; Johnson, Sunahara, & Landolt, 1999). Our visual- and physical-motion stimuli were developed to combine for an immersive experience of self-motion that is similar to environments used in aviation and driving research and training.
Our research can be applied to the clinical space to better understand pathologies of self-motion perception and visual-vestibular integration. Patients with pathologies such as Mal de Débarquement Syndrome (Van Ombergen, Van Rompaey, Maes, Van de Heyning, & Wuyts, 2016), Persistent Postural-Perceptual Dizziness (Popkirov, Staab, & Stone, 2018), and Parkinson's disease (Yakubovich et al., 2020) show lower thresholds for self-motion perception. For example, a recent study has shown that, compared with healthy, age-matched controls, Parkinson's disease patients perform worse on heading judgment tasks because of overweighting of impaired visual-motion cues (Yakubovich et al., 2020). If we can establish electrophysiological biomarkers of healthy versus impaired self-motion perception, we will develop a better understanding of the integration and motor impairments that are common in pathologies such as Parkinson's disease. Identification of these biomarkers in the prediagnostic phase of the disease could lead to a greater time window for possible preventative measures and earlier treatments (Noyce, Lees, & Schrag, 2016).

Conclusion
The present study examined cortical activity elicited in response to self-motion cues that varied in attention allocation and stimulus onset asynchrony. There were two main findings. First, SOA produced robust differences in cortical activity during attention to both visual and physical motion. The electrophysiological signatures of visual (strong beta ERS) versus vestibular (strong beta ERD) weighting bias were enhanced when the attended motion cue was presented 100 msec before the ignored cue. When comparing across conditions of attention allocation, presenting the visual-motion cue first created more robust conditional differences than when physical-motion cues were presented first. These results demonstrate that the timing of visual-vestibular stimuli plays a critical role in multisensory weighting during self-motion perception, and that this weighting process is more sensitive to temporal changes in visual stimuli compared with vestibular stimuli. Second, contrary to the findings of several audiovisual and visuotactile studies, the temporal synchrony of visual- and physical-motion cues did not elicit gamma ERS beyond baseline. It is possible that the 100-msec SOA was not long enough to elicit these hypothesized differences. It could also be the case that visual-vestibular integration does not elicit processes indexed by gamma ERS.

Data Availability Statement
The data and code for all analyses are available online at https://github.com/bentownsend11/Stimulus-onset-asynchrony-affects-attention-related-ERSP-in-self-motion-perception.